Visual Tool-Use on HRBench 8K
[Chart: Accuracy on HRBench 8K. Top entry: ARM-Thinker-7B, 73.7 (Dec 4, 2025)]
Updated 4d ago
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| ARM-Thinker-7B | Size=7B, Backbone=Qwen... | 2025.12 | 73.7 |
| Mini-o3 | Source=Lai et al. (2025) | 2025.12 | 73.3 |
| Qwen3-VL-8B | Size=8B | 2025.12 | 70.4 |
| InternVL3.5-8B | Size=8B | 2025.12 | 69.9 |
| DeepEyes | Source=Lai et al. (2025) | 2025.12 | 69.5 |
| InternVL3-8B | Size=8B | 2025.12 | 68.4 |
| Pixel Reasoner | Source=Lai et al. (2025) | 2025.12 | 66.9 |
| Qwen2.5-VL-7B | Size=7B | 2025.12 | 64.6 |
| GPT-4o | Source=Lai et al. (2025) | 2025.12 | 58.3 |