
Multi-discipline reasoning on MMMU (val) (Accuracy)

81.8 Accuracy

GPT-5

[Chart: Accuracy over time, Feb 12, 2026 to Mar 30, 2026]
Updated 18d ago

Evaluation Results

Date      Accuracy  Method  Links
2026.03   81.8
2026.03   79.6
2026.03   79
2026.03   78.7
2026.03   77.8
2026.03   73.4
2026.03   71.4
2026.03   69.6
2026.03   68
2026.03   67.7
2026.03   67.4
2026.03   66.6
2026.03   55.8
2026.03   53.7
2026.03   53
2026.03   50.9
2026.03   46.1
2026.03   45.8
2026.03   41.2
2026.02   37.2
2026.02   36.5
2026.02   34.1
2026.02   32.9
2026.02   31.7
2026.02   31.1
2026.02   30.5