Publications

UniF^2ace: Fine-grained Face Understanding and Generation with Unified Multimodal Models
Junzhe Li, Xuerui Qiu, Linrui Xu, Liya Guo, Delin Qu, Tingting Long, Chun Fan and Ming Li*. Under Review. 2025. (*Corresponding Authors)
[Paper][Project]

TextSplat: Text-Guided Semantic Fusion for Generalizable Gaussian Splatting
Zhicong Wu, Hongbin Xu, Gang Xu, Ping Nie, Zhixin Yan, Jinkai Zheng, Liangqiong Qu, Ming Li* and Liqiang Nie. Under Review. 2025. (*Corresponding Authors)
[Paper][Project]

FaVChat: Unlocking Fine-Grained Facial Video Understanding with Multimodal Large Language Models
Fufangchen Zhao, Ming Li*†, Linrui Xu, Wenhao Jiang, Jian Gao and Danfeng Yan. Under Review. 2025. (*Corresponding Authors, †Equal Contributors)
[Paper][Project]

PVChat: Personalized Video Chat with One-Shot Learning
Yufei Shi†, Weilong Yan†, Gang Xu, Yumeng Li, Yucheng Chen, Zhenxi Li, Fei Richard Yu, Ming Li* and Si Yong Yeo. Under Review. 2025. (*Corresponding Authors, †Equal Contributors)
[Paper][Project]

Safe-VAR: Safe Visual Autoregressive Model for Text-to-Image Generative Watermarking
Ziyi Wang, Songbai Tan, Gang Xu, Xuerui Qiu, Hongbin Xu, Xin Meng, Ming Li* and Fei Richard Yu. Under Review. 2025. (*Corresponding Authors)
[Paper][Project]

StyleMe3D: Stylization with Disentangled Priors by Multiple Encoders on 3D Gaussians
Cailin Zhuang, Yaoqi Hu, Xuanyang Zhang, Wei Cheng, Jiacheng Bao, Shengqi Liu, Yiying Yang, Xianfang Zeng, Gang Yu and Ming Li*. Under Review. 2025. (*Corresponding Authors)
[Paper][Project]

X-SGS: Safe and Generalizable Gaussian Splatting with X-dimensional Watermarks
Zihang Cheng, Huiping Zhuang, Chun Li, Xin Meng, Ming Li* and Fei Richard Yu. Under Review. 2025. (*Corresponding Authors)
[Paper][Project]

Inter3D: A Benchmark and Strong Baseline for Human-Interactive 3D Object Reconstruction
Gan Chen, Ying He, Mulin Yu, F. Richard Yu, Gang Xu, Fei Ma, Ming Li* and Guang Zhou. Under Review. 2025. (*Corresponding Authors)
[Paper][Project]

WMarkGPT: Watermarked Image Understanding via Multimodal Large Language Models
Songbai Tan, Yao Shu, Xuerui Qiu, Gang Xu, Linrui Xu, Xiangyu Xu, Huiping Zhuang, Ming Li* and Fei Richard Yu. Under Review. 2025. (*Corresponding Authors)
[Paper][Project]

EFTViT: Efficient Federated Training of Vision Transformers with Masked Images on Resource-Constrained Edge Devices
Meihan Wu, Tao Chang, Miaocui, Jie Zhou, Chun Li, Xiangyu Xu, Ming Li*, and Xiaodong Wang. Under Review. 2024. (*Corresponding Authors)
[Paper][Project]

SE-MOT: End-to-End Multi-Object Tracking in Low-Quality Video Scenes Guided by Semantic Enhancement
Jun Du, Weiwei Xing, Ming Li*, and Fei Richard Yu. Under Review. 2024. (*Corresponding Authors)
[Paper][Project]

Resilient Missing-Modality MRI Segmentation Based on Mamba State-Space Modeling and Information-Theoretic Criteria
Runze Cheng, Xihang Qiu, Jiarong Cheng, Xiangyu Xue, Ye Zhang, Chun Li, Ming Li*, and Fei Richard Yu. Under Review. 2024. (*Corresponding Authors)
[Paper][Project]

Robust Brain Tumor Segmentation with Incomplete MRI Modalities Using Hölder Divergence and Mutual Information-Enhanced Knowledge Transfer
Runze Cheng†, Xihang Qiu†, Ming Li†, Ye Zhang, Fei Richard Yu, and Chun Li. Under Review. 2024. (†Equal Contributors)
[Paper][Project]

Uncertainty Quantification for Incomplete Multi-View Data Using Divergence Measures
Zhipeng Xue†, Yan Zhang†, Ming Li†, Chun Li, Yue Liu, and Fei Richard Yu. Under Review. 2024. (†Equal Contributors)
[Paper][Project]

ColonNeRF: Neural Radiance Fields for High-Fidelity Long-Sequence Colonoscopy Reconstruction
Yufei Shi, Beijia Lu, Jia-Wei Liu, Ming Li and Mike Zheng Shou. arXiv. 2023.
[Paper][Project]

EventGPT: Event Stream Understanding with Multimodal Large Language Models
Shaoyu Liu, Jianing Li, Guanghui Zhao, Yunjian Zhang, Xin Meng, Fei Richard Yu, Xiangyang Ji, and Ming Li*. CVPR. 2025. (*Corresponding Authors)
[Paper][Project]

Synthetic-to-Real Self-supervised Robust Depth Estimation via Learning with Motion and Structure Priors
Weilong Yan, Ming Li*, Haipeng Li, Shuwei Shao, Robby T. Tan. CVPR. 2025. (*Corresponding Authors)
[Paper][Project]

Uncertainty Quantification via Hölder Divergence for Multi-View Representation Learning
Yan Zhang†, Ming Li†, Chun Li, Zhaoxia Liu, Ye Zhang, and Fei Richard Yu. IEEE TMM. 2025. (†Equal Contributors)
[Paper][Project]

DEP-SLAM: A Dynamic Environment Perception SLAM System with Large Language Models
Ying He, F. Richard Yu, Fei Ma, Ming Li, and Guang Zhou. ICASSP. 2025.
[Paper][Project]

Corer: Concept Residue Erasing in Text-to-Image Diffusion Models
Yufan Liu, Jinyang An, Wanqian Zhang, Ming Li*, Dayan Wu, Jingzi Gu, Zheng Lin and Weiping Wang. ICME. 2025. (*Corresponding Authors)
[Paper][Project]

LV-VTON: Long-Video Virtual Try-On via Enhanced Visual Autoregressive Modeling
Lulu Tian, Hongxun Yao and Ming Li. ICME. 2025.
[Paper][Project]

OmniStyle: Attention-Optimized Global and Local Image Stylization with Diffusion Model Inversion
Jiarong Cheng, Xihang Qiu, Qing Zhou, Ming Li*, Chun Li*, Yao Lu and Fei Richard Yu. ICME. 2025. (*Corresponding Authors)
[Paper][Project]

Semi-Supervised Disease Classification based on Limited Medical Image Data
Yan Zhang, Zhaoxia Liu, Chun Li and Ming Li. Journal of Biomedical and Health Informatics. 2024.
[Paper][Code]

Instant3D: Instant Text-to-3D Generation
Ming Li, Pan Zhou, Jia-Wei Liu, Jussi Keppo, Min Lin, Shuicheng Yan and Xiangyu Xu. International Journal of Computer Vision. 2024.
[Paper][Project]

FakePoI: A Large-scale Fake Person of Interest Video Detection Benchmark and a Strong Baseline
Lulu Tian, Hongxun Yao, and Ming Li. IEEE TCSVT. 2023.
[Paper][Code]

DR-FER: Discriminative and Robust Representation Learning for Facial Expression Recognition
Ming Li, Huazhu Fu, Shengfeng He, Hehe Fan, Jun Liu, Jussi Keppo and Mike Zheng Shou. IEEE TMM. 2023.
[Paper][Code]

STPrivacy: Spatio-Temporal Privacy-Preserving Action Recognition
Ming Li, Xiangyu Xu, Hehe Fan, Pan Zhou, Jun Liu, Jia-Wei Liu, Jiahe Li, Jussi Keppo, Mike Zheng Shou, and Shuicheng Yan. ICCV. 2023.
[Paper][Code]

Exploiting Multi-view Part-wise Correlation via an Efficient Transformer for Vehicle Re-Identification
Ming Li, Jun Liu, Ce Zheng, Xinming Huang, and Ziming Zhang. IEEE TMM. 2021 (ESI Highly Cited Paper).
[Paper]

Self-supervised Geometric Features Discovery with Interpretable Attention for Vehicle Re-Identification and Beyond
Ming Li, Xinming Huang, and Ziming Zhang. ICCV. 2021.
[Paper][Code]

TreeRNN: Topology-Preserving Deep Graph Embedding and Learning
Yecheng Lyu, Ming Li, Xinming Huang, Ulkuhan Guler, Patrick Schaumont, and Ziming Zhang. ICPR. 2020.
[Paper][Code]

RNN Training along Locally Optimal Trajectories via Frank-Wolfe Algorithm
Yun Yue, Ming Li, Venkatesh Saligrama, and Ziming Zhang. ICPR. 2020.
[Paper][Code]

LodoNet: A Deep Neural Network with 2D Keypoint Matching for 3D LiDAR Odometry Estimation
Ce Zheng, Yecheng Lyu, Ming Li, Ziming Zhang. ACM MM. 2020.
[Paper]