Visual Tool-Use on MME-RealWorld
Top result: ARM-Thinker-7B, Accuracy 65.8 (Dec 4, 2025)
Evaluation Results
Method            Details                     Date      Accuracy
ARM-Thinker-7B    Size=7B, Backbone=Qwen...   2025.12   65.8
Mini-o3           Source=Lai et al. (2025)    2025.12   65.5
Pixel Reasoner    Source=Lai et al. (2025)    2025.12   64.4
DeepEyes          Source=Lai et al. (2025)    2025.12   64
Qwen3-VL-8B       Size=8B                     2025.12   63.1
InternVL3.5-8B    Size=8B                     2025.12   62.8
InternVL3-8B      Size=8B                     2025.12   61.2
Qwen2.5-VL-7B     Size=7B                     2025.12   58.5
GPT-4o            Source=Lai et al. (2025)    2025.12   45.2