Language Understanding on MMLU o=1 (Semantic-level)

Accuracy: 76.9

W/O Decontamination

[Chart: accuracy (axis ticks 28.228, 40.864, 53.5, 66.136); date Jan 27, 2026]
Updated 4d ago

Evaluation Results

Date      Accuracy   (unlabeled)
2026.01   76.9       0.098
2026.01   76.6       0.095
2026.01   73.5       0.17
2026.01   72.8       0.163
2026.01   72.5       0.054
2026.01   68.3       0.118
2026.01   68.2       0.117
2026.01   67.1       -
2026.01   63.2       0.039
2026.01   62.3       0.189
2026.01   62         0.051
2026.01   60.1       0.167
2026.01   58.8       0.154
2026.01   56.6       0.001
2026.01   56.5       -
2026.01   52.5       0.041
2026.01   46.7       0.033
2026.01   43.4       -
2026.01   41.7       0.254
2026.01   40.5       0.029
2026.01   30.1       0.133