| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Mathematical Reasoning | AMO-Bench | Avg@5 | 0.646 | 48 |
| Mathematical Reasoning | AMO-Bench | Mean@64 Accuracy | 11.8 | 27 |
| Mathematical Reasoning | AMO-Bench | Seed (Avg@5) | 0.56 | 16 |
| Mathematical Reasoning | AMO-Bench | Average@16 | 14.8 | 12 |
| Mathematical Reasoning | AMO-Bench | Pass@8 | 36.72 | 6 |
| Mathematical Reasoning | AMO-Bench VeRA-H / VeRA-H Pro | Avg@5 Accuracy (Seeds) | 31.75 | 1 |