AIMO-2 Winning Solution: Building State-of-the-Art Mathematical Reasoning Models with OpenMathReasoning dataset

About

This paper presents our winning submission to the AI Mathematical Olympiad - Progress Prize 2 (AIMO-2) competition. Our recipe for building state-of-the-art mathematical reasoning models relies on three key pillars. First, we create a large-scale dataset comprising 540K unique high-quality math problems, including olympiad-level problems, and their 3.2M long-reasoning solutions. Second, we develop a novel method to integrate code execution with long reasoning models through iterative training, generation, and quality filtering, resulting in 1.7M high-quality Tool-Integrated Reasoning solutions. Third, we create a pipeline to train models to select the most promising solution from many candidates. We show that such generative solution selection (GenSelect) can significantly improve upon the majority voting baseline. Combining these ideas, we train a series of models that achieve state-of-the-art results on mathematical reasoning benchmarks. To facilitate further research, we release our code, models, and the complete OpenMathReasoning dataset under a commercially permissive license.
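
The majority voting baseline that GenSelect is compared against can be illustrated with a minimal sketch. This is not the authors' code: the function name `majority_vote` and the exact-string matching of answers are assumptions made for illustration, and a real pipeline would likely normalize mathematically equivalent answers before counting.

```python
from collections import Counter
from typing import Optional, Sequence


def majority_vote(answers: Sequence[Optional[str]]) -> Optional[str]:
    """Pick the most frequent final answer among sampled solutions.

    Hypothetical helper: answers are compared as stripped strings, and
    candidates with no extracted answer (None or empty) are ignored.
    """
    valid = [a.strip() for a in answers if a is not None and a.strip()]
    if not valid:
        return None
    # Counter.most_common orders equal counts by first occurrence,
    # so ties go to the answer that appeared earliest.
    return Counter(valid).most_common(1)[0][0]


if __name__ == "__main__":
    sampled = ["42", "17", "42", None, " 42 ", "17"]
    print(majority_vote(sampled))
```

Majority voting only counts final answers, whereas a selection model in the spirit of GenSelect reads the candidate solutions themselves, which is why it can recover a correct answer that is in the minority.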

Ivan Moshkov, Darragh Hanley, Ivan Sorokin, Shubham Toshniwal, Christof Henkel, Benedikt Schifferer, Wei Du, Igor Gitman • 2025

Related benchmarks

Task	Dataset	Metric	Result	
Mathematical Reasoning	MATH 500	Accuracy (Acc)	92.9	600
Mathematical Reasoning	AIME 2024	Accuracy	87.9	370
Mathematical Reasoning	AIME 24	Accuracy	64.06	358
Mathematical Reasoning	Minerva	Pass@1 Accuracy	22.3	289
Mathematical Reasoning	MATH 500	pass@1	95.55	239
Mathematical Reasoning	AIME 2025	Accuracy	86.1	227
Mathematical Reasoning	AMC23	PASS@1 Accuracy	82.7	216
Mathematical Reasoning	OlympiadBench	Accuracy	74.09	213
Mathematical Reasoning	AIME 25	Pass@1 Accuracy	50.1	190
Mathematical Reasoning	AIME 24	Pass@1 Accuracy	44	153

Showing 10 of 45 rows
