Multi-task Language Understanding on MMLU-Pro (Performance and Token Count)

Best performance: 92.86 (Agent Q-Mix)

[Chart: performance over time; y-axis: Performance; x-axis tick: Apr 1, 2026]
Updated 16d ago

Evaluation Results

Date      Performance   Token Count
2026.04   92.86         112
2026.04   92.86         1.25
2026.04   88.57         1.14
2026.04   87.14         97
2026.04   87.14         1
2026.04   85.71         1.02
2026.04   84.29         1.05
2026.04   81.43         2.17
2026.04   80            471
2026.04   79            2.71