Multi-step Reasoning on MMMU (Accuracy)
[Figure: Accuracy (MMMU Multi-step Reasoning) by date. Plotted point: Claude-3.5, 68.3, May 13, 2026]
Updated 19d ago
Evaluation Results
Method                 Size    Date       Accuracy (MMMU Multi-step Reasoning)
Claude-3.5             -       2026.05    68.3
Qwen2.5-VL-Instruct    72B     2026.05    67
VL-Rethinker           7B      2026.05    56.7
MoCA                   7B      2026.05    54.8
Qwen2.5-VL-Instruct    7B      2026.05    54.3
GPT-4o                 -       2026.05    51.9
Pixel Reasoner         7B      2026.05    50.8
Llava-OV               7B      2026.05    48.8
DeepEyes               7B      2026.05    45.2
GPT-4o-mini            -       2026.05    45.1
mPLUG-Owl3             7B      2026.05    42.9
Docopilot              8B      2026.05    36.6
R1-VL                  7B      2026.05    7.8