Multi-discipline reasoning on MMMU (val) (Accuracy)
[Chart: Accuracy by date. Highlighted point: Qwen2.5-VL-72B Instruct, 37.2 Accuracy, Feb 12, 2026.]
Evaluation Results
Method                          Setting          Date      Accuracy
Qwen2.5-VL-72B Instruct         Zero-shot=true   2026.02   37.2
Qwen2.5-VL-32B + AT-RL (Ours)   Zero-shot=true   2026.02   36.5
Qwen2.5-VL-32B + VPPO           Zero-shot=true   2026.02   34.1
Gemini 2.0 Flash                Zero-shot=true   2026.02   32.9
Qwen2.5-VL-32B Instruct         Zero-shot=true   2026.02   31.7
OpenAI GPT-4o                   Zero-shot=true   2026.02   31.1
Claude 3.5 Sonnet               Zero-shot=true   2026.02   30.5