| Dataset Name | SOTA Method | Metric | Trend | Updated | |
|---|---|---|---|---|---|
| GSM8K (test) | | Accuracy 94.5 | 192 | 9d ago | |
| GSM8K | DCCD | Accuracy 95.15 | 187 | 4d ago | |
| GSM8K | S-GRPO | Accuracy 93.8 | 126 | 1mo ago | |
| MATH | MoCAN | Accuracy 75.5 | 121 | 4d ago | |
| AMC | LLM-J | Accuracy 80 | 95 | 18d ago | |
| Gaokao En 2023 | Legislator-Executor (Ours) | Accuracy 79 | 91 | 4d ago | |
| JEEBench | APRM | Accuracy 74.4 | 82 | 1mo ago | |
| AQuA | CLoT | Accuracy 91.8 | 78 | 8d ago | |
| AMC23 | ReBalance | Pass@1 Accuracy 100 | 68 | 10d ago | |
| GSM Hard | IPOMP | Accuracy 82.6 | 67 | 4d ago | |
| MultiArith | POES | Accuracy 98.3 | 65 | 4d ago | |
| MATH 500 | DEPO | Accuracy 94.4 | 60 | 10d ago | |
| MATH500 | UMAD | Pass@1 Rate 87.2 | 58 | 1mo ago | |
| OlympiadBench | APRM | Accuracy 90.7 | 54 | 1mo ago | |
| AMC23 | UMAD | Pass@1 87.5 | 51 | 16d ago | |
| MATH500 | ReST-MCTS | Accuracy 93.2 | 51 | 1mo ago | |
| GSM8K | | Accuracy (GSM8K) 100 | 49 | 9d ago | |
| MSVAMP (test) | Language Steering | Average Accuracy 83.9 | 45 | 1mo ago | |
| OlympiadBench | DECS | Accuracy 70.3 | 44 | 10d ago | |
| SVAMP | GPT-4o | Accuracy 94.2 | 40 | 8d ago | |
| AIME 2024 | ExOPD | Accuracy 0.627 | 37 | 1mo ago | |
| GSM8K | ReBalance | Pass@1 Accuracy 96.8 | 36 | 1mo ago | |
| AIME 24 | | Pass@1 Score 73.3 | 36 | 1mo ago | |
| MATH 500 | ReBalance | Pass@1 95.2 | 36 | 1mo ago | |
| OlympiadB | APRM | Accuracy 90.7 | 36 | 1mo ago | |