
Multimodal Reasoning on MMMU (val)

78.2 Accuracy

OpenAI-o1

[Chart: accuracy over time. x-axis: Jul 31, 2024 to Mar 18, 2026; y-axis: 46.22 to 71.1275]
Updated 1mo ago

Evaluation Results

Date      Accuracy
2025.06   78.2
2025.12   75
2025.12   74.1
2025.12   72.6
2025.12   71.4
2025.12   71.2
2025.12   71
2025.06   71
2025.12   70.8
2025.12   70.2
2025.06   70.2
2024.12   70.1
2025.12   69.9
2025.12   69.6
2024.07   69.1
-         69.1
2025.06   69.1
2024.07   68.3
-         68.3
2025.12   68.1
2025.12   67.4
2025.12   66.7
2025.12   64.6
2024.07   64.5
2024.12   64.5
2024.12   63.9
2024.12   63.1
2024.12   62.7
2025.12   62.5
2024.07   62.2
2025.12   62.2
-         62.2
2025.12   61.6
2025.12   61.3
2025.12   61.1
2025.06   60.8
2024.07   60.6
2024.12   60
2024.12   59.7
2025.06   58.4
2025.06   58
2025.12   57.4
2025.12   57.1
2026.01   57
2026.01   56.89
2024.12   56.8
2025.12   56.7
2025.06   56.7
2024.07   56.4
2025.12   56
2024.12   56
2025.06   56
2025.12   55.7
2026.01   55.44
2025.12   55.4
2026.01   55.22
2025.12   55.2
2024.12   55.2
2025.06   55.2
2025.06   55.2
2024.12   55.1
2024.12   55
2025.12   54.7
2025.06   54.7
2025.12   54.3
2025.12   54.1
2024.12   54.1
2024.12   54.1
2025.06   52.9
2025.05   52.78
2024.12   52.6
2026.02   52.56
2025.12   52.5
2025.12   52.3
2024.12   52.3
2025.06   52.3
2026.01   52.22
2025.12   52.2
2026.02   51.89
2026.02   51.67
2026.01   51.44
2025.12   51.2
2024.12   51.2
2026.02   50.33
2026.02   50.11
-         49.8
2024.12   49.7
2024.07   49.6
2025.12   49.6
2026.01   49.44
2026.03   49.33
2025.12   49.2
2025.12   48.8
2025.05   48.67
2026.02   48.56
2026.01   48.44
2026.01   48.11
2024.12   47.9
2025.05   47.89
2026.02   47.45
Showing 100 of 144 rows