
Multi-discipline reasoning on MMMU (val) (Accuracy)

81.8 Accuracy

GPT-5

[Chart: accuracy over time; y-axis ticks from 28.448 to 70.001, x-axis from Feb 12, 2026 to May 29, 2026]
Updated 2d ago

Evaluation Results

Date      Accuracy  Method  Links
2026.03   81.8
2026.03   79.6
2026.03   79
2026.03   78.7
2026.03   77.8
2026.03   73.4
2026.03   71.4
2026.03   69.6
2026.03   68
2026.03   67.7
2026.05   67.7
2026.05   67.7
2026.03   67.4
2026.03   66.6
2026.05   59.8
2026.05   58
2026.05   56.8
2026.05   56
2026.03   55.8
2026.05   55.2
2026.05   54.9
2026.05   54.4
2026.03   53.7
2026.03   53
2026.05   52.4
2026.05   51
2026.03   50.9
2026.05   49.6
2026.03   46.1
2026.03   45.8
2026.03   41.2
2026.02   37.2
2026.02   36.5
2026.02   34.1
2026.02   32.9
2026.02   31.7
2026.02   31.1
2026.02   30.5