
Multi-discipline Reasoning on MMMU

69.1 Accuracy

GPT-4o

[Chart: MMMU accuracy over time, Dec 5, 2024 to Mar 27, 2026]
Updated 19d ago

Evaluation Results

Date       Accuracy
2026.03    69.1
2026.03    65.6
2026.03    56
2026.01    55.44
2026.01    53.67
2026.03    52.1
2026.03    52
2026.03    51.2
2026.01    49.33
2026.03    48.8
2026.01    48.68
2026.03    37.2
2026.03    37
2026.02    36.8
2024.12    36.4
2024.12    36.4
2024.12    36.4
2024.12    36.4
2026.03    36.33
2026.03    36.1
2026.03    36.1
2026.03    36.1
2026.02    36.1
2024.12    36.1
2026.03    36
2026.01    35.89
2026.03    35.89
2026.03    35.6
2024.12    35.4
2026.02    35.3
2024.12    35.3
2026.03    35.22
2026.03    34.8
2026.01    34.67