
Multimodal Reasoning on MMMU (val)

Accuracy: 78.2

OpenAI-o1

[Chart: accuracy over time, Jul 31, 2024 to May 8, 2026]
Updated 23d ago

Evaluation Results

Date     Accuracy  Method  Links
2025.06  78.2
2025.12  75
2025.12  74.1
2025.12  72.6
2025.12  71.4
2025.12  71.2
2025.12  71
2025.06  71
2025.12  70.8
2025.12  70.2
2025.06  70.2
2024.12  70.1
2025.12  69.9
2025.12  69.6
2024.07  69.1
-        69.1
2025.06  69.1
2024.07  68.3
-        68.3
2025.12  68.1
2025.12  67.4
2026.05  67.4
2025.12  66.7
2026.05  66.6
2026.05  66.2
2025.12  64.6
2024.07  64.5
2024.12  64.5
2024.12  63.9
2026.05  63.4
2024.12  63.1
2024.12  62.7
2025.12  62.5
2024.07  62.2
2025.12  62.2
-        62.2
2025.12  61.6
2025.12  61.3
2025.12  61.1
2025.06  60.8
2024.07  60.6
2024.12  60
2024.12  59.7
2026.04  58.6
2025.06  58.4
2025.06  58
2025.12  57.4
2026.04  57.4
2026.04  57.2
2025.12  57.1
2026.01  57
2026.01  56.89
2024.12  56.8
2025.12  56.7
2025.06  56.7
2024.07  56.4
2025.12  56
2024.12  56
2025.06  56
2025.12  55.7
2026.01  55.44
2025.12  55.4
2026.04  55.3
2026.01  55.22
2025.12  55.2
2024.12  55.2
2025.06  55.2
2025.06  55.2
2024.12  55.1
2024.12  55
2025.12  54.7
2025.06  54.7
2026.04  54.7
2026.04  54.6
2025.12  54.3
2025.12  54.1
2024.12  54.1
2024.12  54.1
2026.04  53
2026.04  53
2025.06  52.9
2025.05  52.78
2024.12  52.6
2026.02  52.56
2025.12  52.5
2025.12  52.3
2024.12  52.3
2025.06  52.3
2026.01  52.22
2025.12  52.2
2026.02  51.89
2026.02  51.67
2026.04  51.6
2026.01  51.44
2026.04  51.3
2025.12  51.2
2024.12  51.2
2026.02  50.33
2026.02  50.11
2026.04  50.1
Showing 100 of 168 rows