
Multi-task Language Understanding on MMLU (MMLU Score)

86.4 MMLU Score

GPT-4

[Chart: MMLU Score over time, May 16, 2025 to May 26, 2026]
Updated 6d ago

Evaluation Results

Date      MMLU Score
2026.01   86.4
2025.05   75
2025.05   74
2025.05   73
2025.05   72
2025.10   71.2
2025.05   71
2026.01   70
2025.05   70
2025.10   69.9
2025.05   69.5
2025.05   69
2025.05   68.5
2025.05   68
2025.10   67
2026.02   66.2
2026.03   65.44
2025.05   65.29
2026.03   65.12
2026.03   64.62
2026.03   64.45
2026.03   64.45
2026.02   63.7
2026.03   63.25
2026.03   63.15
2026.03   63.09
2026.02   63
2026.03   62.79
2026.02   62.7
2026.02   62.6
2026.02   62.3
2026.03   62.13
2026.03   61.64
2026.03   61.48
2026.05   61.1
2026.03   60.25
2026.03   59.55
2026.03   59.52
2026.03   58.49
2026.03   57.81
2026.03   56.44
2025.05   56.37
2025.05   56.2
2025.05   56.2
2026.03   56.13
2026.03   55.87
2026.03   54.06
2026.03   53.82
2026.03   53.8
2026.03   53.48
2026.02   52.9
2025.05   52.67
2026.02   52.3
2026.01   52.1
2026.03   51.84
2026.02   51.7
2026.05   51.67
2026.03   51.45
2026.02   51.4
2026.03   51.24
2026.02   51
2026.02   50.8
2026.05   50.8
2026.03   50.24
2026.01   48.1
2026.02   47.4
2026.01   47
2026.03   46.85
2026.03   46.72
2026.03   46.36
2026.05   45.8
2026.03   45.7
2026.03   43.56
2026.02   43.5
2026.03   41.83
2026.03   41.48
2026.03   41.33
2026.03   40.29
2026.03   40.21
2026.02   39.5
2026.03   39.4
2026.03   38.33
2026.03   38.03
2026.02   37.7
2026.03   37.03
2026.03   36.76
2026.03   35.25
2026.03   35.22
2026.02   34.5
2026.02   33.6
2026.03   32.77
2026.03   32.25
2026.03   31.23
2026.03   29.63
2026.03   29.11
2026.03   29.01
2026.03   27.23
2026.03   27.1
2026.03   27.1
2026.03   26.34
Showing 100 of 116 rows