Kaituo Feng

I am a first-year PhD student at the Multimedia Lab (MMLab) at the Chinese University of Hong Kong, working with Prof. Xiangyu Yue. Previously, I was a master's student at Beijing Institute of Technology (BIT), advised by Prof. Changsheng Li (2022-2025). I also received my Bachelor's degree in Computer Science from BIT (2018-2022). I have published several papers in top conferences and journals, including ICML, ICLR, CVPR, KDD, IEEE TIP, IEEE TKDE, and IEEE TPAMI.

My research interests include MLLMs and AIGC. I welcome discussion and collaboration; feel free to drop me an email.

Email: kaituofeng@gmail.com

[Google Scholar] [Github]


Selected Publications


Video-R1: Reinforcing Video Reasoning in MLLMs

arXiv 2025

Kaituo Feng, Kaixiong Gong, Bohao Li, Zonghao Guo, Yibing Wang, Tianshuo Peng, Junfei Wu, Xiaoying Zhang, Benyou Wang, Xiangyu Yue

Explore the R1 paradigm for eliciting video reasoning within MLLMs.

Paper Code

Reinforcing Spatial Reasoning in Vision-Language Models with Interwoven Thinking and Visual Drawing

arXiv 2025

Junfei Wu, Jian Guan, Kaituo Feng, Qiang Liu, Shu Wu, Liang Wang, Wei Wu, Tieniu Tan

Achieving o3-like thinking for spatial reasoning across images and videos.

Paper Code

SophiaVL-R1: Reinforcing MLLMs Reasoning with Thinking Reward

arXiv 2025

Kaixuan Fan*, Kaituo Feng*, Haoming Lyu, Dongzhan Zhou, Xiangyu Yue (*equal contribution)

Integrating a thinking-level reward to address the phenomenon of "wrong thinking, correct answer".

Paper Code

Critique-GRPO: Advancing LLM Reasoning with Natural Language and Numerical Feedback

arXiv 2025

Xiaoying Zhang, Hao Sun, Yipeng Zhang, Kaituo Feng, Chaochao Lu, Chao Yang, Helen Meng

Using external critiques as language feedback to improve reasoning.

Paper Code

AV-Odyssey: Can Your Multimodal LLMs Really Understand Audio-Visual Information?

arXiv 2024

Kaixiong Gong*, Kaituo Feng*, Bohao Li*, Yibing Wang, Mofan Cheng, Shijia Yang, Jiaming Han, Benyou Wang, Yutong Bai, Zhuoran Yang, Xiangyu Yue (*equal contribution)

We propose a comprehensive benchmark for evaluating the audio-visual understanding abilities of MLLMs.

Paper Code

On the Road to Portability: Compressing End-to-End Motion Planner for Autonomous Driving

CVPR 2024

Kaituo Feng, Changsheng Li, Dongchun Ren, Ye Yuan, Guoren Wang

We make the first attempt to explore a knowledge distillation method for compressing end-to-end autonomous driving planners.

Paper Code

Keypoint-based Progressive Chain-of-Thought Distillation for LLMs

ICML 2024

Kaituo Feng, Changsheng Li, Xiaolu Zhang, Jun Zhou, Ye Yuan, Guoren Wang

We propose a new compression method that progressively distills the emergent reasoning capabilities of LLMs into smaller models while encouraging precise mimicry of significant tokens.

Paper

Towards Open Temporal Graph Neural Networks

ICLR 2023, Oral, 90/4922

Kaituo Feng, Changsheng Li, Xiaolu Zhang, Jun Zhou

We propose the first class-incremental learning method for temporal GNNs, allowing temporal graphs to evolve in real-world scenarios with an open class set.

Paper Code

Shared Growth of Graph Neural Networks via Prompted Free-Direction Knowledge Distillation

IEEE TPAMI, KDD 2022

Kaituo Feng, Yikun Miao, Changsheng Li, Ye Yuan, Guoren Wang

We use reinforcement learning to exchange beneficial knowledge between two GNNs.

Paper


Selected Honors and Awards

  • National Scholarship, Ministry of Education of China (top 2%), 2024.

  • National Scholarship, Ministry of Education of China (top 2%), 2023.

  • Outstanding Undergraduate Student of Beijing Institute of Technology, 2022.

  • Silver Medal of the 45th ACM-ICPC Asia Regional Contest, 2020.

  • First Prize (top 1%) of China Undergraduate Mathematical Contest in Modeling (CUMCM), 2020.

  • Gold Medal of Group Programming Ladder Tournament China Finals, 2020.


Contact


  • Email: kaituofeng@gmail.com