| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Mathematical Reasoning | MGSM | Accuracy: 91.7 | 114 |
| Multilingual Mathematical Reasoning | MGSM (test) | Accuracy: 93.4 | 57 |
| Multilingual Mathematical Reasoning | MGSM | Accuracy (Bn): 64.8 | 36 |
| Multilingual Mathematical Reasoning | MGSM 1.0 (test) | Accuracy (ru): 69.6 | 35 |
| Mathematical Reasoning | MGSM (test) | Accuracy (MGSM): 75.6 | 29 |
| Reasoning | MGSM | Accuracy: 90 | 24 |
| Multilingual mathematical reasoning | MGSM | Speedup (de): 6.08 | 20 |
| Mathematical Reasoning | MGSM non-EU languages (test) | Accuracy: 91.4 | 16 |
| Mathematical Reasoning | MGSM 24 official EU languages | Accuracy: 93 | 14 |
| Mathematical Reasoning | MGSM Bangla | Accuracy (Original): 0.88 | 13 |
| Mathematical Reasoning | Bn-MGSM (test) | Accuracy: 89.2 | 12 |
| Mathematical Reasoning | MGSM average (test) | Accuracy: 84.8 | 12 |
| Natural Language Understanding | MGSM | Accuracy: 6.2 | 11 |
| Mathematical Reasoning | MGSM-zh (test) | Accuracy: 79.6 | 10 |
| Mathematical Reasoning | MGSM Code Switched P, I - (EN), Q(X) (Avg) | Language Consistency: 100 | 8 |
| Mathematical Reasoning | MGSM Monolingual P, I, Q - (X) (Avg) | Language Consistency: 100 | 8 |
| Multilingual | MGSM | MGSM Score: 66.56 | 7 |
| STEM | MGSM Zh | Pass@1: 69.7 | 6 |
| Multilingual Mathematical Reasoning | MGSM 18 languages | Accuracy: 72.5 | 6 |
| Mathematical Reasoning | MGSM Thai | Score: 87.6 | 5 |
| Mathematical Reasoning | MGSM Māori | Accuracy: 41.6 | 4 |
| Mathematical Reasoning | MGSM 10 non-English languages | Non-Eng AVG Accuracy (Original): 76.4 | 3 |
| Mathematical Reasoning | MGSM Telugu | Accuracy: 78.8 | 2 |
| Multilingual Reasoning | MGSM | Accuracy: 49.9 | 2 |