Offline GUI Agent Evaluation on CAGUI (Full)
[Chart: Type Accuracy and Step Success Rate. Highlighted point: Mimo-VL-7B + WildGUI, Type Accuracy 90.3, May 14, 2026.]
Updated 19d ago
Evaluation Results
| Method | Tag | Date | Type Accuracy | Step Success Rate |
|---|---|---|---|---|
| Mimo-VL-7B + WildGUI | Pre-training=WildGUI | 2026.05 | 90.3 | 71 |
| UI-TARS-7B | Model Category=Open-so... | 2026.05 | 88.6 | 70.3 |
| Qwen2.5-VL-7B* + WildGUI | Pre-training=WildGUI | 2026.05 | 88.3 | 65.4 |
| Mimo-VL-7B | Model Category=Open-so... | 2026.05 | 82.2 | 63.4 |
| OS-Atlas-7B | Model Category=Open-so... | 2026.05 | 81.5 | 55.9 |
| Qwen2.5-VL-7B* | Model Category=Open-so... | 2026.05 | 74.2 | 55.2 |
| Aguvis-7B | Model Category=Open-so... | 2026.05 | 67.4 | 38.2 |
| OS-Genesis-7B | Model Category=Open-so... | 2026.05 | 38.1 | 14.5 |
| GPT-4o | Model Category=Closed-... | 2026.05 | 3.7 | 3.7 |