GUI Action Prediction on AITZ
[Chart: Task Match (TM) and Execution Match (EM) over time. Top entry: Mobile-R1, 77.05 Task Match (TM), Jun 25, 2025.]
Updated 1mo ago
Evaluation Results
Method      Model Size  Date     Task Match (TM)  Execution Match (EM)
Mobile-R1   3B          2025.06  77.05            60.5
OS-Atlas    7B          2025.06  74.13            58.45
Odyssey     7B          2025.06  59.17            31.6
Aguvis      7B          2025.06  35.71            18.99
OS-Genesis  7B          2025.06  19.98            8.45