GUI reasoning on AndroidControl High
[Chart: Type, GR (Goal Rate), and SR (Success Rate) by date. Highlighted point: CAPO, Type 65.91, Dec 2, 2025.]
Updated 3d ago
Evaluation Results
Method        Prompting  Date     Type   GR (Goal Rate)  SR (Success Rate)
CAPO          zero-shot  2025.12  65.91  61.47           47.71
GRPO          zero-shot  2025.12  60.1   58.25           46.81
Os-Atlas-4B   zero-shot  2025.12  49.01  49.51           22.77
QwenVL2.5-3B  zero-shot  2025.12  47.81  46.51           38.9