MGSM

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Mathematical Reasoning | MGSM | Accuracy | 93.2 | 236 |
| Multilingual Mathematical Reasoning | MGSM (test) | Accuracy | 93.4 | 109 |
| Mathematical Reasoning | MGSM (test) | Accuracy (ZH) | 88 | 80 |
| Multilingual Mathematical Reasoning | MGSM | Accuracy | 94.9 | 52 |
| Question Answering | MGSM | AUC | 90.79 | 51 |
| Mathematical Reasoning | MGSM | Accuracy (Bn) | 88.8 | 49 |
| Hallucination Detection | MGSM | AUC | 84.62 | 42 |
| Multilingual Mathematical Reasoning | MGSM 1.0 (test) | Accuracy (ru) | 69.6 | 35 |
| Reasoning | MGSM | Accuracy | 94.4 | 29 |
| Multilingual Mathematical Reasoning | MGSM Thai (test) | Accuracy | 41.6 | 25 |
| Multilingual Math Reasoning | MGSM | Mean@3 | 84.43 | 23 |
| Math reasoning | mGSM v2 | Accuracy (Seen) | 78 | 21 |
| Multilingual mathematical reasoning | MGSM | Speedup (de) | 6.08 | 20 |
| Mathematical Reasoning | MGSM Rev2 | Random Baseline Score | 74.8 | 16 |
| Mathematical Reasoning | MGSM non-EU languages (test) | Accuracy | 91.4 | 16 |
| Mathematical Reasoning | MGSM 24 official EU languages | Accuracy | 93 | 14 |
| Mathematical Reasoning | MGSM Bangla | Accuracy (Original) | 0.88 | 13 |
| Mathematical Reasoning | MGSM | Average@10 | 44 | 12 |
| Mathematical Reasoning | MGSM ALL 1.0 (test) | Accuracy | 75.93 | 12 |
| Mathematical Reasoning | MGSM EN 1.0 (test) | Accuracy | 94 | 12 |
| Machine Translation | MGSM | Bn Score | 72.89 | 12 |
| Mathematical Reasoning | Bn-MGSM (test) | Accuracy | 89.2 | 12 |
| Mathematical Reasoning | MGSM average (test) | Accuracy | 84.8 | 12 |
| Mathematical Reasoning | MGSM-zh (test) | Accuracy | 89.6 | 12 |
| Natural Language Understanding | MGSM | Accuracy | 6.2 | 11 |
Showing 25 of 40 rows