| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Arithmetic Reasoning | SingleEq | Accuracy 98.8 | 43 |
| Math reasoning | Singleeq | EM 0.8307 | 10 |
| Mathematical Reasoning | SingleEQ (test) | Accuracy 99.01 | 4 |
| Mathematical Reasoning | SINGLEEQ | Solve Rate 96.1 | 4 |
| Arithmetic Reasoning | SingleEq (test) | Accuracy 0.795 | 4 |
| Online Out-of-Distribution Detection | SingleEq Near-shift OOD | Accuracy 93.15 | 3 |