General Multimodal Evaluation on MMMU (val)
[Chart: Accuracy over time. Best result shown: GPT-4o, 69.1 (Jul 9, 2025).]
Evaluation Results
| Method | Settings | Date | Accuracy |
|---|---|---|---|
| GPT-4o | | 2025.07 | 69.1 |
| D2Dpar | Reasoning Strategy=D2D... | 2025.07 | 67.6 |
| D2Dloc | Reasoning Strategy=D2D... | 2025.07 | 66 |
| Qwen2.5-VL-7B w/ GRPO | Reasoning Mode=Deliber... | 2025.07 | 64.7 |
| D2Djus | Reasoning Strategy=D2D... | 2025.07 | 64.4 |
| GPT-4V | | 2025.07 | 63.1 |
| D2Ipar | Reasoning Strategy=D2I... | 2025.07 | 61.6 |
| D2Ijus | Reasoning Strategy=D2I... | 2025.07 | 61.4 |
| D2Iloc | Reasoning Strategy=D2I... | 2025.07 | 61.1 |
| Qwen2.5-VL-7B* | Backbone=Qwen2.5-VL-7B | 2025.07 | 59.3 |
| Qwen2.5-VL-7B w/ GRPO† | Reasoning Mode=Intuiti... | 2025.07 | 59.1 |
| InternVL2.5-8B | | 2025.07 | 56 |
| Qwen2-VL-7B | | 2025.07 | 54.1 |
| InternVL2-8B | | 2025.07 | 52.6 |