Mobile-use agent task completion and intent alignment on OS-Kairos user-specific Modified
[Chart: Success Rate (SR) by date. Highlighted point: IFRAgent, 69.74, Aug 12, 2025. Selectable metrics: Success Rate (SR), Type Accuracy (Type(%)), Intent Alignment Rate (IAR(%)).]
Updated 13d ago
Evaluation Results
| Method | Base Model | Date | Success Rate (SR) | Type Accuracy (Type(%)) | Intent Alignment Rate (IAR(%)) |
|---|---|---|---|---|---|
| IFRAgent | OS-Atlas-7B... | 2025.08 | 69.74 | 85.31 | 68.42 |
| UI-TARS-1.5-7B | UI-TARS-1.5-7B | 2025.08 | 61.49 | 75.86 | 53.24 |
| IFRAgent | UI-TARS-1.5-7B | 2025.08 | 60.75 | 77.04 | 58.8 |
| OS-Atlas-7B-Pro | OS-Atlas-7B... | 2025.08 | 58.85 | 79.52 | 53.78 |
| IFRAgent | GPT-4o+OCR... | 2025.08 | 56.33 | 77.36 | 53.75 |
| IFRAgent | Qwen2.5-VL-7B | 2025.08 | 51.64 | 59.62 | 50.31 |
| GPT-4o+OCR model | GPT-4o+OCR... | 2025.08 | 40.65 | 68.92 | 37.67 |
| Qwen2.5-VL-7B | Qwen2.5-VL-7B | 2025.08 | 24.27 | 27.77 | 23.64 |