| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| MATH | DeepSeek-R1 | Accuracy | 97.6 | 229 | 1mo ago |
| AIME 2024 | | Accuracy | 100 | 62 | 1mo ago |
| Gaokao MathQA | Qwen2.5-Math-72B | Accuracy | 86.3 | 60 | 25d ago |
| AIME 25 | | Accuracy | 93.3 | 54 | 1mo ago |
| AIME | MSV 64 | AIME Score | 1,289.8 | 52 | 1mo ago |
| MATH | | Accuracy | 95.2 | 32 | 8d ago |
| Code2Math | | Accuracy Ratio | 98 | 30 | 1mo ago |
| MinervaMath (test) | PASER | Accuracy | 21.2 | 28 | 1mo ago |
| MathVerse (testmini) | | Accuracy | 64.9 | 28 | 1mo ago |
| AIME 2025 | ePF w/ LaM | Top-1 Accuracy (%) | 26.97 | 26 | 18d ago |
| AIME 2024 | ePF w/ LaM | Top-1 Accuracy | 29.13 | 26 | 18d ago |
| MATH-Vision (test) | | Accuracy | 68.8 | 26 | 1mo ago |
| MATH (test) | Gemini-Ultra | Accuracy | 53.2 | 25 | 1mo ago |
| MATH500 | Granite-3.3 | MATH500 Score | 85.88 | 21 | 1mo ago |
| AIME 2024 | Qwen3-8B | Pass@1 | 76 | 21 | 17d ago |
| MATH eval (test) | TROLL | Success Rate | 59.1 | 20 | 1mo ago |
| MATH | DeepSeekMath-Base | Overall Accuracy | 0.362 | 20 | 1mo ago |
| MATH | | Accuracy @ t1 | 47.4 | 18 | 1mo ago |
| Gaokao MathCloze | | Accuracy | 72.9 | 18 | 1mo ago |
| Omni-MATH | MSV 64 | Best-of-N Accuracy | 35.4 | 17 | 1mo ago |
| AIME | MSV 64 | Best-of-N Accuracy | 50.98 | 17 | 1mo ago |
| AMC12 | MSV 64 | Best-of-N Accuracy | 58.66 | 17 | 1mo ago |
| OlympiadBench | MSV 64 | Best-of-N Accuracy | 53.66 | 17 | 1mo ago |
| MATH | MSV 64 | Best-of-N Accuracy | 76.07 | 17 | 1mo ago |
| Omni-MATH | MSV 64 | AUTC | 1,046.36 | 17 | 1mo ago |