| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multi-step mathematical reasoning | We-Math (test) | S1 Score: 72.8 | 20 |
| Math Reasoning | We-Math | Pass@1: 76.4 | 19 |
| Mathematical & Geometric Reasoning | We-Math | Accuracy@8: 77.7 | 16 |
| Mathematical Reasoning | We-Math mini (test) | Accuracy: 66.1 | 13 |