Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions
About
OpenAI's o1 has recently sparked a surge of interest in the study of large reasoning models (LRMs). Building on this momentum, Marco-o1 not only focuses on disciplines with standard answers, such as mathematics, physics, and coding -- which are well-suited for reinforcement learning (RL) -- but also places greater emphasis on open-ended solutions. We aim to address the question: "Can the o1 model effectively generalize to broader domains where clear standards are absent and rewards are challenging to quantify?" Marco-o1 is powered by Chain-of-Thought (CoT) fine-tuning, Monte Carlo Tree Search (MCTS), reflection mechanisms, and innovative reasoning strategies, all optimized for complex real-world problem-solving tasks.
Yu Zhao, Huifeng Yin, Bo Zeng, Hao Wang, Tianqi Shi, Chenyang Lyu, Longyue Wang, Weihua Luo, Kaifu Zhang • 2024
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Mathematical Reasoning | AIME 2024 | Accuracy | 9.22 | 251 |
| Mathematical Reasoning | AIME 2025 | Accuracy | 6.95 | 227 |
| Mathematical Reasoning | AMC 23 | Accuracy | 45.12 | 198 |
| Mathematical Reasoning | MATH 500 | Accuracy | 70.48 | 106 |
| Mathematical Reasoning | Beyond AIME | Accuracy | 2.77 | 32 |