Offline GUI Agent Evaluation on AndroidControl Low
[Chart: Action Type Accuracy and Step Success Rate; highlighted point: UI-TARS-7B, Action Type Accuracy 98, May 14, 2026]
Updated 19d ago
Evaluation Results
Method                    Tag                        Date     Action Type Accuracy  Step Success Rate
UI-TARS-7B                Model Category=Open-so...  2026.05  98                    90.8
Mimo-VL-7B + WildGUI      Pre-training=WildGUI       2026.05  95.5                  91.8
Qwen2.5-VL-7B* + WildGUI  Pre-training=WildGUI       2026.05  94.9                  90.3
Qwen2.5-VL-7B*            Model Category=Open-so...  2026.05  94.1                  85
Aguvis-7B                 Model Category=Open-so...  2026.05  93.9                  89.4
Mimo-VL-7B                Model Category=Open-so...  2026.05  92.9                  87.9
OS-Genesis-7B             Model Category=Open-so...  2026.05  90.7                  74.2
GPT-4o                    Model Category=Closed-...  2026.05  74.3                  19.4
OS-Atlas-7B               Model Category=Open-so...  2026.05  73                    67.3