Evaluator Accuracy on AndroidBench
Top Overall Accuracy: 94.9 (GUIDE)
[Chart: Overall Accuracy, AitW Accuracy, AutoUI (Base) Accuracy, AutoUI (Large) Accuracy, CogAgent Accuracy; date shown: Apr 6, 2026]
Updated 12d ago
Evaluation Results
| Method | Date | Overall Accuracy | AitW Accuracy | AutoUI (Base) Accuracy | AutoUI (Large) Accuracy | CogAgent Accuracy |
| --- | --- | --- | --- | --- | --- | --- |
| GUIDE (Backbone=gemini-3.0-flash) | 2026.04 | 94.9 | 85.6 | 98.1 | 100 | 95.8 |
| Cap.+Mixtral | 2026.04 | 92.9 | - | - | - | - |
| WebJudge (Backbone=gemini-3.0-flash) | 2026.04 | 92.6 | 83.6 | 93.3 | 100 | 93.3 |
| AgentTrek (Backbone=gemini-3.0-flash) | 2026.04 | 92.4 | 86.1 | 93.9 | 99.2 | 90.3 |
| GPT-4V | 2026.04 | 90.6 | - | - | - | - |
| Cap.+GPT-4 | 2026.04 | 89.8 | - | - | - | - |
| QWen-VL | 2026.04 | 70.2 | - | - | - | - |
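The page does not state how Overall Accuracy is derived. For the three methods that report all four per-column scores (GUIDE, WebJudge, AgentTrek), the reported figure is consistent with an unweighted mean of the AitW, AutoUI (Base), AutoUI (Large), and CogAgent accuracies, rounded to one decimal. The Python sketch below checks that arithmetic. The function names and the half-up rounding rule are assumptions made for illustration, not the benchmark's documented method.

```python
from decimal import Decimal, ROUND_HALF_UP

# Values copied from the results above:
# method -> (overall, [AitW, AutoUI (Base), AutoUI (Large), CogAgent]).
ROWS = {
    "GUIDE": ("94.9", ["85.6", "98.1", "100", "95.8"]),
    "WebJudge": ("92.6", ["83.6", "93.3", "100", "93.3"]),
    "AgentTrek": ("92.4", ["86.1", "93.9", "99.2", "90.3"]),
}


def unweighted_mean(values):
    """Mean of the per-column accuracies, rounded half-up to one decimal.

    Decimal is used so that ties such as 92.55 round exactly
    instead of depending on binary floating-point error.
    """
    total = sum(Decimal(v) for v in values)
    return (total / len(values)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


def check_rows(rows=ROWS):
    """Map each method to True when its reported overall equals the rounded mean."""
    return {
        name: unweighted_mean(columns) == Decimal(overall)
        for name, (overall, columns) in rows.items()
    }


if __name__ == "__main__":
    for name, ok in check_rows().items():
        print(name, "matches" if ok else "differs")
```

Methods that report only an overall figure (Cap.+Mixtral, GPT-4V, Cap.+GPT-4, QWen-VL) cannot be checked this way, so the agreement on three rows is suggestive rather than a confirmation of the scoring rule.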