Marco-o1: Towards Open Reasoning Models for Open-Ended Solutions

About

OpenAI's o1 has recently sparked a surge of interest in the study of large reasoning models (LRMs). Building on this momentum, Marco-o1 not only focuses on disciplines with standard answers, such as mathematics, physics, and coding -- which are well-suited for reinforcement learning (RL) -- but also places greater emphasis on open-ended resolutions. We aim to address the question: "Can the o1 model effectively generalize to broader domains where clear standards are absent and rewards are challenging to quantify?" Marco-o1 is powered by Chain-of-Thought (CoT) fine-tuning, Monte Carlo Tree Search (MCTS), reflection mechanisms, and innovative reasoning strategies -- optimized for complex real-world problem-solving tasks.

Yu Zhao, Huifeng Yin, Bo Zeng, Hao Wang, Tianqi Shi, Chenyang Lyu, Longyue Wang, Weihua Luo, Kaifu Zhang • 2024

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Mathematical Reasoning	AIME 2024	Accuracy	9.22	370
Mathematical Reasoning	AIME 2025	Accuracy	6.95	227
Mathematical Reasoning	AMC 23	Accuracy	45.12	198
Mathematical Reasoning	MATH 500	MATH 500 Accuracy	70.48	106
Mathematical Reasoning	Beyond AIME	Accuracy	2.77	45
English-to-Chinese translation	WMT24	GRF	81.88	21
English-to-Chinese translation	WMT 23	GRF	85.34	21
English-to-Chinese translation	Flores-200	GRF	86.84	21
English-to-Chinese translation	Our dataset	GRF Score	75.09	21
Machine Translation	WMT 24++	AR Score	53.5	14

