
Multi-task Language Understanding on MMLU-M

Accuracy: 28.79

ZipCal

[Chart: accuracy over time; y-axis ticks 20.938, 22.9765, 25.015, 27.0535; x-axis label Mar 17, 2026]
Updated 1mo ago

Evaluation Results

Method    Links
Date       Accuracy   (unlabeled)
2026.03    28.79      -
2026.03    27.4       -
2026.03    26.29      -
2026.03    26.29      -
2026.03    26.27      3.76
2026.03    24.44      -
2026.03    24.44      -
2026.03    24.44      -
2026.03    24.44      -
2026.03    24.44      -
2026.03    23.33      -
2026.03    23.33      -
2026.03    23.19      -
2026.03    23.11      -0.08
2026.03    23.01      -
2026.03    23.01      -
2026.03    23.01      -
2026.03    23.01      -
2026.03    22.51      -
2026.03    22.02      -
2026.03    22.02      -
2026.03    21.85      -
2026.03    21.85      -
2026.03    21.48      -
2026.03    21.24      -
2026.03    21.24      -