
Perception-R1: Pioneering Perception Policy with Reinforcement Learning

About

Inspired by the success of DeepSeek-R1, we explore the potential of rule-based reinforcement learning (RL) in MLLM post-training for perception policy learning. While promising, our initial experiments reveal that incorporating a thinking process through RL does not consistently lead to performance gains across all visual perception tasks. This leads us to delve into the essential role of RL in the context of visual perception. In this work, we return to the fundamentals and explore the effects of RL on different perception tasks. We observe that perceptual complexity is a major factor in determining the effectiveness of RL. We also observe that reward design plays a crucial role in further approaching the upper limit of model perception. To leverage these findings, we propose Perception-R1, a scalable RL framework using GRPO during MLLM post-training. With a standard Qwen2.5-VL-3B-Instruct, Perception-R1 achieves +4.2% on RefCOCO+, +17.9% on PixMo-Count, +4.2% on PageOCR, and notably, 31.9% AP on COCO2017 val for the first time, establishing a strong baseline for perception policy learning.
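To make the pairing of a rule-based reward with GRPO more concrete, here is a minimal sketch, not the authors' implementation. It assumes an IoU reward for a single-box grounding task (the abstract does not specify the actual reward functions) and computes GRPO-style group-relative advantages by normalizing each sampled response's reward against the mean and standard deviation of its group. The function names, the `eps` term, and the use of the population standard deviation are illustrative assumptions.

```python
from statistics import mean, pstdev


def iou(box_a, box_b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of sampled responses.

    Assumption: population std plus a small eps for numerical stability;
    real implementations may differ in these details.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)
    return [(r - mu) / (sigma + eps) for r in rewards]


# Hypothetical example: three sampled box predictions for one query.
target = (10, 10, 50, 50)
samples = [(10, 10, 50, 50), (20, 20, 60, 60), (100, 100, 120, 120)]
rewards = [iou(s, target) for s in samples]
advantages = group_relative_advantages(rewards)
```

Under these assumptions, the exact-match prediction gets the largest advantage and the non-overlapping one the smallest. When every response in a group earns the same reward, all advantages are zero and that group contributes no learning signal.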

En Yu, Kangheng Lin, Liang Zhao, Jisheng Yin, Yana Wei, Yuang Peng, Haoran Wei, Jianjian Sun, Chunrui Han, Zheng Ge, Xiangyu Zhang, Daxin Jiang, Jingyu Wang, Wenbing Tao • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Object Detection | COCO (val) | mAP | 31.9 | 637
Visual Mathematical Reasoning | MathVision | Accuracy | 22.03 | 254
Vision-centric Reasoning | RealworldQA | Accuracy | 55.8 | 66
Visual Perception and Reasoning | BLINK | Accuracy | 46.44 | 64
Visual Search | V* | - | - | 28
Visual Math | ChartQA | Accuracy | 81.6 | 25
Perception-intensive Reasoning | MMStar | Accuracy | 54.8 | 19
Perception-intensive Reasoning | MME RealWorld | Accuracy | 37.5 | 19
Math & Chart | MathVista | Accuracy | 58.1 | 19
Referring Expression Comprehension | KVG-Bench | Accuracy (Air, Seen Categories) | 23.03 | 17
Showing 10 of 20 rows
