
MMLU

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multi-task Language Understanding | MMLU | Accuracy 99.7 | 842 |
| Language Understanding | MMLU | Accuracy 96.6 | 756 |
| Multitask Language Understanding | MMLU (test) | Accuracy 92.16 | 303 |
| Multitask Language Understanding | MMLU | Accuracy 89.8 | 206 |
| General Knowledge | MMLU | MMLU General Knowledge Accuracy 91.1 | 170 |
| Language Understanding | MMLU 5-shot (test) | Accuracy 74.2 | 149 |
| Multiple-choice Question Answering | MMLU | Accuracy 97.5 | 148 |
| Language Understanding | MMLU (test) | MMLU Average Accuracy 88 | 136 |
| Language Understanding | MMLU 5-shot | Accuracy 90.58 | 132 |
| General Reasoning | MMLU | MMLU Accuracy 95.1 | 126 |
| Multiple Choice Question Answering | MMLU-Pro | MMLU-Pro Overall Accuracy 84.8 | 116 |
| Language Understanding | MMLU 0-shot | Accuracy 70.46 | 110 |
| Multi-task Language Understanding | MMLU | Accuracy 73 | 101 |
| Multitask Language Understanding | MMLU-Pro | Accuracy 87.1 | 99 |
| Multi-task Language Understanding | MMLU | Accuracy 74.4 | 87 |
| Multi-task Language Understanding | MMLU (test) | Normalized Accuracy 90.46 | 76 |
| Knowledge | MMLU | Accuracy 85.93 | 71 |
| Language Understanding | MMLU-Pro | Accuracy 80.6 | 70 |
| Question Answering | MMLU | Accuracy 88.7 | 62 |
| Multitask Language Understanding | MMLU (val) | Accuracy 63.16 | 58 |
| Question Answering | MMLU-Pro Natural Setting (test) | Accuracy 87.8 | 56 |
| Question Answering | MMLU-Pro | Accuracy 89.1 | 56 |
| Question Answering | MMLU | Test Error Probability 0.141 | 52 |
| General Reasoning | MMLU-Pro | MMLU-Pro General Reasoning Avg@8 Acc 90.1 | 51 |
| Reasoning | MMLU-Pro | Accuracy 90.1 | 50 |
Showing 25 of 369 rows
...