| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multimodal Reasoning | DynaMath | Accuracy 67.2 | 24 |
| Visual Mathematical Reasoning | DynaMath | Accuracy 60.9 | 22 |
| Mathematical Reasoning | DynaMath DMath | Accuracy 56.9 | 18 |
| Step-wise Verification | DynaMath | Macro F1 66.7 | 18 |
| Dynamic mathematical reasoning | DynaMath (test) | Accuracy 64.8 | 15 |
| Multimodal Mathematical Reasoning | DynaMath-W | Accuracy 60.5 | 14 |
| STEM & Puzzle | DynaMath (test) | Accuracy 83.4 | 11 |
| Mathematical Reasoning | DynaMath | Avg@3 58.7 | 10 |
| Mathematical Reasoning | DynaMath | Pass@1 51.01 | 9 |
| First Incorrect Step Identification | DynaMath | FISI F1 Score 26.7 | 6 |