Visual Tool-Use on HRBench 4K
Top result: ARM-Thinker-7B, Accuracy 80.1 (Dec 4, 2025)
Evaluation Results
Method           Details                     Date     Accuracy
ARM-Thinker-7B   Size=7B, Backbone=Qwen...   2025.12  80.1
Mini-o3          Source=Lai et al. (2025)    2025.12  77.5
Qwen3-VL-8B      Size=8B                     2025.12  76.8
Pixel Reasoner   Source=Lai et al. (2025)    2025.12  74
DeepEyes         Source=Lai et al. (2025)    2025.12  73.2
InternVL3-8B     Size=8B                     2025.12  70.3
InternVL3.5-8B   Size=8B                     2025.12  69.9
Qwen2.5-VL-7B    Size=7B                     2025.12  69.1
GPT-4o           Source=Lai et al. (2025)    2025.12  62