
Multi-task Language Understanding on MMLU-Pro AceReason (Reduced)

71.1 Accuracy

Model-first Greedy

[Chart: accuracy by date; y-axis ticks 56.228, 60.089, 63.95, 67.811; x-axis label May 21, 2026]
Updated 8d ago

Evaluation Results

Date      Accuracy
2026.05   71.1
2026.05   69.2
2026.05   68.7
2026.05   68.7
2026.05   68.7
2026.05   67.8
2026.05   66.9
2026.05   66.6
2026.05   66.4
2026.05   56.8