
MM-Eureka: Exploring the Frontiers of Multimodal Reasoning with Rule-based Reinforcement Learning

About

DeepSeek R1 and o1 have demonstrated powerful reasoning capabilities in the text domain through stable large-scale reinforcement learning. To enable broader applications, some works have attempted to transfer these capabilities to multimodal reasoning. However, these efforts have been constrained by the limited difficulty of the selected tasks and by relatively small training scales, making it challenging to demonstrate strong multimodal reasoning abilities. To address this gap, we introduce the MMK12 dataset and the MM-EUREKA models with 7B and 32B parameters. The former is a high-quality multimodal mathematics reasoning dataset featuring diverse knowledge domains with human-verified answers and solution processes. The latter is a multimodal model trained with rule-based reinforcement learning on MMK12, using online filtering and a two-stage training strategy to enhance training stability. MM-EUREKA demonstrates remarkable performance gains in multimodal mathematical reasoning, outperforming previous powerful models such as InternVL2.5-78B and InternVL2.5-38B-MPO. In particular, MM-EUREKA achieves competitive or superior performance compared to both open-source and closed-source models, and trails slightly behind o1 in multidisciplinary reasoning tasks. To foster further research in this area, we open-source our complete pipeline, including all code, models, and data, at https://github.com/ModalMinds/MM-EUREKA
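
The abstract names rule-based rewards and online filtering but does not spell out either rule. The following is only a minimal sketch of how such a pair of rules could look, not the authors' implementation. It assumes that answers are written as \boxed{...}, that the reward is a 0/1 exact string match, and that filtering drops prompts whose sampled responses all receive the same reward. The function names `rule_based_reward` and `keep_prompt` are hypothetical.

```python
import re
from typing import List


def rule_based_reward(response: str, reference: str) -> float:
    """Hypothetical rule: 1.0 if the last \\boxed{...} answer equals the reference, else 0.0."""
    # Assumption: the model writes its final answer as \boxed{...} without nested braces.
    answers = re.findall(r"\\boxed\{([^{}]*)\}", response)
    if not answers:
        return 0.0  # no parseable answer counts as incorrect
    return 1.0 if answers[-1].strip() == reference.strip() else 0.0


def keep_prompt(rewards: List[float]) -> bool:
    """Assumed filtering rule: keep a prompt only if its sampled responses differ in reward."""
    if not rewards:
        return False
    mean = sum(rewards) / len(rewards)
    # With 0/1 rewards, a mean of exactly 0 or 1 means every sample agreed,
    # so the group carries no relative learning signal.
    return 0.0 < mean < 1.0


if __name__ == "__main__":
    samples = [r"So the area is \boxed{12}", r"I get \boxed{14}", "No final answer given"]
    rewards = [rule_based_reward(s, "12") for s in samples]
    print(rewards, keep_prompt(rewards))
```

Under these assumptions, a prompt that every sampled response solves (or that none solves) is skipped for that training step, which is one plausible way filtering could help stabilize training.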

Fanqing Meng, Lingxiao Du, Zongkai Liu, Zhixiang Zhou, Quanfeng Lu, Daocheng Fu, Tiancheng Han, Botian Shi, Wenhai Wang, Junjun He, Kaipeng Zhang, Ping Luo, Yu Qiao, Qiaosheng Zhang, Wenqi Shao • 2025

Related benchmarks

Task                                 Dataset                 Result                  Rank
Visual Mathematical Reasoning        MathVista               Accuracy 70.6           189
Multimodal Reasoning                 MMMU (val)              Accuracy 55.2           114
Multimodal Reasoning                 MMStar                  Accuracy 64.3           81
Visual Mathematical Reasoning        MathVerse               Accuracy 49.6           73
Visual Mathematical Reasoning        MathVision              Accuracy 27.4           63
Visual Mathematical Reasoning        WeMath                  Accuracy 67.4           53
Mathematical Multimodal Reasoning    MathVista               Accuracy 73             46
Multimodal Mathematical Reasoning    MathVista mini (test)   Overall Accuracy 73     33
Multi-modal Reasoning                MathVision (test)       Accuracy (%) 26.9       32
Mathematical Reasoning               GeoQA (test)            Accuracy 62.86          31
Showing 10 of 49 rows
