Multi-Image Understanding on MuirBench 142 (test)
[Chart: Score over time. Best result: Gemini 3 Pro, 86.1 (Jan 15, 2026).]
Evaluation Results
| Method | Model Category | Date | Score |
|---|---|---|---|
| Gemini 3 Pro | API cal... | 2026.01 | 86.1 |
| GPT-5 | API cal... | 2026.01 | 78.6 |
| GLM-4.1V-9B | Open we... | 2026.01 | 74.7 |
| Gemini 2.5 Pro | API cal... | 2026.01 | 74.5 |
| Gemini 2.5 Flash | API cal... | 2026.01 | 73.5 |
| GPT-5 mini | API cal... | 2026.01 | 71.4 |
| Qwen3-VL-8B | Open we... | 2026.01 | 64.4 |
| Qwen3-VL-4B | Open we... | 2026.01 | 63.8 |
| Molmo2-8B | Molmo2... | 2026.01 | 63.7 |
| Eagle2.5-8B | Open we... | 2026.01 | 61.8 |
| Molmo2-4B | Molmo2... | 2026.01 | 60.5 |
| Claude Sonnet 4.5 | API cal... | 2026.01 | 59.6 |
| Molmo2-O-7B | Molmo2... | 2026.01 | 58.4 |
| InternVL3.5-8B | Open we... | 2026.01 | 55.8 |
| MiniCPM-V-4.5-8B | Open we... | 2026.01 | 53.3 |
| InternVL3.5-4B | Open we... | 2026.01 | 53.1 |
| Keye-VL-1.5-8B | Open we... | 2026.01 | 51.2 |
| PLM-3B | Open mo... | 2026.01 | 25.7 |
| PLM-8B | Open mo... | 2026.01 | 23.5 |