Multitask Language Understanding on MMLU (MA, MI, Error Rate)

73 Mean Accuracy (MA)

PROBELLM

[Chart: Mean Accuracy (MA); axis ticks 32.44, 42.97, 53.5, 64.03; date Feb 13, 2026]
Updated 4d ago

Evaluation Results

Date      MA   Error Rate   MI
2026.02   73   27           36
2026.02   64   36           70
2026.02   63   37           38
2026.02   62   38           73
2026.02   59   41           86
2026.02   54   46           65
2026.02   51   49           36
2026.02   51   49           72
2026.02   40   60           66
2026.02   37   63           70
2026.02   36   64           68
2026.02   34   66           47