
Multitask Language Understanding on MMLU Pro (pass@1)

86.7 pass@1

Qwen 3.5

[Chart: pass@1 on MMLU Pro over time, Aug 27, 2024 to May 11, 2026]
Updated 21d ago

Evaluation Results

Date      pass@1
2026.05   86.7
2026.05   83.73
2026.05   80.62
2026.05   76.81
2026.05   72.17
2024.08   48.94
2024.08   47.48
2024.08   47.48
2024.08   47.21
2024.08   47.18
2024.08   46.89
2024.08   44.83
2024.08   34.42
2024.08   34.25
2024.08   33.05
2024.08   33.05
2024.08   33.01
2024.08   32.86
2024.08   32.73
2024.08   30.73
2024.08   30.43
2024.08   30.41
2024.08   30.41
2024.08   30.11
2024.08   30.09
2024.08   29.72
2024.08   28.72
2024.08   28.49
2024.08   28.03
2024.08   28.03
2024.08   27.98
2024.08   27.95
2024.08   27.92
2026.01   0.793
2026.01   0.793
2026.01   0.791
2026.01   0.789
2026.01   0.688