General VQA on GUIChat
[Chart: Accuracy by date. Highlighted point: Claude 4 sonnet, 93.14 Accuracy, Jan 26, 2026.]
Evaluation Results
| Method | Model Category | Date | Accuracy |
|---|---|---|---|
| Claude 4 sonnet | Closed-... | 2026.01 | 93.14 |
| Qwen 2.5 VL 72B | Open-So... | 2026.01 | 88.01 |
| Qwen 2.5 VL 32B | Open-So... | 2026.01 | 85.21 |
| Gemini 2.5 flash | Closed-... | 2026.01 | 83.05 |
| InternVL3 78B | Open-So... | 2026.01 | 79.83 |
| Qwen 2.5 VL 72B + Tools | Tool-Pl... | 2026.01 | 77.13 |
| GPT 5 + Tools | Tool-Pl... | 2026.01 | 76.51 |
| AdaReasoner 7B | Tool-Pl... | 2026.01 | 73.91 |
| PixelReasoner | Tool-Pl... | 2026.01 | 72.45 |
| GPT 5 | Closed-... | 2026.01 | 71.41 |
| Qwen 2.5 VL 7B | Open-So... | 2026.01 | 68.09 |
| DeepEyes | Tool-Pl... | 2026.01 | 65.9 |
| Qwen 2.5 VL 7B + Tools | Tool-Pl... | 2026.01 | 56.76 |
| Qwen 2.5 VL 3B | Open-So... | 2026.01 | 46.26 |