| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Mathematical Reasoning | OMNIMATH random subset of 128 samples | Top-1 Accuracy | 10.93 | 12 |
| Mathematical Reasoning | OmniMath (test) | Top-1 Accuracy | 0.446 | 8 |
| Mathematical Reasoning | OmniMath (train) | Training Dataset (%) | 50.1 | 3 |