GUI Action Prediction on Android Control High
[Chart: best Task Match (TM) score is 76.5, Mobile-R1, Jun 25, 2025. Metrics shown: Task Match (TM), Exact Match (EM). Updated 1mo ago.]
Evaluation Results
Method        Model Size   Date      Task Match (TM)   Exact Match (EM)
Mobile-R1     3B           2025.06   76.5              65.2
Qwen2.5-VL    7B           2025.06   75.1              62.9
OS-Atlas      7B           2025.06   70.4              56.5
OS-Genesis    7B           2025.06   65.9              44.4
Aguvis        7B           2025.06   65.6              54.2
Odyssey       7B           2025.06   58.8              32.7