Lanyun's Homepage

Lanyun Zhu 祝澜耘

Ph.D. Student
Singapore University of Technology and Design (SUTD)

Email: [email protected]

About Me

I'm a final-year PhD student (2021-) at the Singapore University of Technology and Design (SUTD), advised by Professor Jun Liu and Professor Soh De Wen. Before that, I received my bachelor’s degree from Beihang University in June 2020. I also spent some wonderful time at Megvii, Sensetime and Johns Hopkins University. Currently, I am supported by the AISG PhD Fellowship Programme (one of the top scholarships in Singapore for AI research). I work closely with NVIDIA, Alibaba and Tencent.

My research directions are multimodal learning and computer vision. Currently, most of my work focuses on multimodal large language models (MLLMs) and image segmentation. My research goal is to build efficient, trustworthy, and fine-grained multimodal systems that can process or integrate information from diverse modalities (such as text, images, videos, and data from other sensors) to address a wide range of real-world industrial and scientific challenges. I believe that a practical multimodal system should be cheap, with lower training and deployment costs; powerful, with more comprehensive and fine-grained capabilities; and reliable, with stronger robustness and minimal instability. I am exploring new techniques to achieve these goals within MLLMs and to advance their applications in real-world industrial scenarios, such as online content safety, as well as in scientific domains such as healthcare and agriculture.

I am always open to research collaborations. Please feel free to drop me an email if you are interested.

Resume

[English Resume] [Chinese Resume (中文简历)]

News

(Mar 2025) One paper accepted to CVPR 2025!

(Jan 2025) I passed my PhD final oral defense!

(Mar 2024) Two papers accepted to CVPR 2024!

(Jul 2023) One paper accepted to ICCV 2023!

(Apr 2023) We released the paper and code for SAM-Adapter, a pioneering attempt to fine-tune SAM!

(Mar 2023) One paper accepted to CVPR 2023!

(Mar 2021) One paper accepted to CVPR 2021!

Selected Publications

* refers to equal contribution

Conference Papers

[CVPR2025] Lanyun Zhu, Tianrun Chen, Qianxiong Xu, Xuanyi Liu, Deyi Ji, Haiyang Wu, De Wen Soh, Jun Liu, POPEN: Preference-Based Optimization and Ensemble for LVLM-Based Reasoning Segmentation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2025
[NeurIPS2024] Qianxiong Xu, Xuanyi Liu, Lanyun Zhu, Guosheng Lin, Cheng Long, Ziyue Li, Rui Zhao, Hybrid Mamba for Few-Shot Segmentation, Annual Conference on Neural Information Processing Systems (NeurIPS) 2024 [paper]
[ICML2024] Deyi Ji, Feng Zhao, Lanyun Zhu, Wenwei Jin, Hongtao Lu, Jieping Ye, Discrete Latent Perspective Learning for Segmentation and Detection, International Conference on Machine Learning (ICML) 2024 (spotlight) [paper]
[CVPR2024] Lanyun Zhu, Tianrun Chen, Deyi Ji, Jieping Ye, Jun Liu, LLaFS: When Large Language Models Meet Few-Shot Segmentation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024 [paper]
[CVPR2024] Lanyun Zhu, Tianrun Chen, Jianxiong Yin, Simon See, Jun Liu, Addressing Background Context Bias in Few-Shot Segmentation through Iterative Modulation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2024 [paper]
[ICCV2023] Lanyun Zhu, Tianrun Chen, Jianxiong Yin, Simon See, Jun Liu, Learning Gabor Texture Features for Fine-Grained Recognition, International Conference on Computer Vision (ICCV) 2023 [paper]
[CVPR2023] Lanyun Zhu*, Tianrun Chen*, Jianxiong Yin, Simon See, Jun Liu, Continual Semantic Segmentation with Automatic Memory Sample Selection, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2023 [paper]
[CVPR2021] Lanyun Zhu*, Deyi Ji*, Shiping Zhu, Weihao Gan, Wei Wu, Junjie Yan, Learning Statistical Texture for Semantic Segmentation, IEEE Conference on Computer Vision and Pattern Recognition (CVPR) 2021 [paper]
[ICCV2023 Workshop] Tianrun Chen*, Lanyun Zhu*, Chaotao Ding, Runlong Cao, Shangzhan Zhang, Yan Wang, Zejian Li, Lingyun Sun, Papa Mao, Ying Zang, SAM-Adapter: Adapting Segment Anything in Underperformed Scenes, ICCV2023 1st Workshop on Visual Continual Learning [paper] [code] (Github 1000+ Stars)
[3DV2022] Xiao Fu, Shangzhan Zhang, Tianrun Chen, Yichong Lu, Lanyun Zhu, Xiaowei Zhou, Andreas Geiger, Yiyi Liao, Panoptic NeRF: 3D-to-2D Label Transfer for Panoptic Urban Scene Segmentation, International Conference on 3D Vision (3DV) 2022 [paper]

Journal Papers

[TIP] Lanyun Zhu, Tianrun Chen, Deyi Ji, Jieping Ye, Jun Liu, Not Every Patch is Needed: Towards a More Efficient and Effective Backbone for Video-based Person Re-identification, IEEE Transactions on Image Processing
[TMM] Tianrun Chen, Chaotao Ding, Lanyun Zhu, Ying Zang, Yiyi Liao, Zejian Li, Lingyun Sun, Reality3DSketch: Rapid 3D Modeling of Objects from Single Free-hand Sketches, IEEE Transactions on Multimedia
[TMI] Yan Wang, Jian Cheng, Yixin Chen, Shuai Shao, Lanyun Zhu, Zhenzhou Wu, Tao Liu, Haogang Zhu, FVP: Fourier Visual Prompting for Source-Free Unsupervised Domain Adaptation of Medical Image Segmentation, IEEE Transactions on Medical Imaging [paper]

Preprint Papers

Tianrun Chen, Chunan Yu, Jing Li, Jiangqi Zhang, Lanyun Zhu, Deyi Ji, Yong Zhang, Ying Zang, Zejian Li, Lingyun Sun, Reasoning3D - Grounding and Reasoning in 3D: Fine-Grained Zero-Shot Open-Vocabulary 3D Reasoning Part Segmentation via Large Vision-Language Models, Arxiv Preprint [paper]

The full list of publications is available on Google Scholar.

Experiences

Research Intern | CCVL Lab, Johns Hopkins University | Apr 2020 - Sep 2021
Mentor: Dr. Yingwei Li | Leader: Prof. Alan Yuille
Research Intern | Sensetime, Beijing, China | Jun 2020 - Jul 2021
Mentor: Mr. Deyi Ji | Mr. Wei Wu
Research Intern | Megvii, Beijing, China | Sep 2019 - May 2020
Mentor: Dr. Zhikang Liu | Leader: Dr. Chi Zhang

Service

Conference Reviewer: CVPR, ICML, NeurIPS, ICLR, ECCV, AAAI, ACM MM

Journal Reviewer: IEEE TPAMI, IJCV, IEEE TIP, IEEE TMM, IEEE TCSVT, IEEE TII, IEEE TIM, PR

Organizer: ICME2024 Grand Challenge – The 2nd Multi-Modal Video Reasoning and Analyzing Competition