Massive Multitask Language Understanding on MMLU (Performance Profile)

56.6 MMLU

Qwen3-4B + FBS-Full (ours)

[Chart: MMLU score over time, Jan 29, 2026 to Apr 20, 2026; y-axis ticks from 44.016 to 53.817]
Updated 1mo ago

Evaluation Results

Method    Links
2026.01    56.6    5320.736
2026.01    56.4    7551.030
2026.01    55.1    76010
2026.01    55      5550.7430
2026.01    55      5950.8215
2026.01    54.9    6460.922
2026.01    54.7    5700.818
2026.04    46.1    ---
2026.04    45.7    ---
2026.04    45.3    ---
2026.04    45.2    ---
2026.04    45.1    ---
2026.04    45      ---
2026.04    44.9    ---
2026.04    44.8    ---
2026.04    44.5    ---