Multi-modal Question Answering on MMMU (val)
[Chart: Accuracy over time, Jan 21, 2025 to May 5, 2026. Top result: 70.7, Proprietary API SOTA (Hurst et al., 2024). Updated 28d ago.]
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| Proprietary API SOTA (Hurst et al., 2024) | Model Type=Proprietary... | 2025.01 | 70.7 |
| Gemini-1.5-flash + UnAC | Backbone=Gemini-1.5-fl... | 2026.05 | 60.9 |
| GPT4-V + UnAC | Backbone=GPT4-V, Promp... | 2026.05 | 60.7 |
| GPT4-V + SKETCHPAD | Backbone=GPT4-V, Promp... | 2026.05 | 59.7 |
| GPT4-V + CCoT | Backbone=GPT4-V, Promp... | 2026.05 | 58.7 |
| GPT4-V | Backbone=GPT4-V, Promp... | 2026.05 | 57.2 |
| GPT4-V + SoM | Backbone=GPT4-V, Promp... | 2026.05 | 57.2 |
| Open-Source SOTA (Chen et al., 2024d) | Model Type=Open-Source... | 2025.01 | 56.2 |
| Gemini-1.5-flash | Backbone=Gemini-1.5-fl... | 2026.05 | 56.1 |
| InternVL2.0-8B + UnAC | Backbone=InternVL2.0-8... | 2026.05 | 54.7 |
| InternVL2.0-8B | Backbone=InternVL2.0-8... | 2026.05 | 51.8 |
| LLaVA-OneVision-7B + UnAC | Backbone=LLaVA-OneVisi... | 2026.05 | 51.0 |
| LLaVA-OneVision-7B | Backbone=LLaVA-OneVisi... | 2026.05 | 48.8 |
| IXC-2.5-Chat | Model Type=Open-Source... | 2025.01 | 44.1 |
| IXC-2.5 | Model Type=Open-Source... | 2025.01 | 42.9 |
| LLaVA-v1.6-7B + UnAC | Backbone=LLaVA-v1.6-7B... | 2026.05 | 37.4 |
| LLaVA-v1.6-7B | Backbone=LLaVA-v1.6-7B... | 2026.05 | 36.9 |