GUI Navigation on Online-Mind2Web (Hard)
Best result: Claude 4 Sonnet CU, Success Rate 53.52
[Chart: Success Rate over time; plotted date Feb 14, 2026]
Evaluation Results
| Method | Details | Date (link) | Success Rate |
| Claude 4 Sonnet CU | Model Type=Proprietary | 2026.02 | 53.52 |
| Ovis2.5S-GRPO | Training Method=GRPO (... | 2026.02 | 51.35 |
| Ovis2.5S-RLOO | Training Method=RLOO (... | 2026.02 | 51.35 |
| Claude 3.7 Sonnet CU | Model Type=Proprietary | 2026.02 | 46.48 |
| Ovis2.5S-RF++ | Training Method=REINFO... | 2026.02 | 44.59 |
| Ovis2.5SFT | Training Method=SFT | 2026.02 | 39.19 |
| UI-TARS | Model Scale=7B, Versio... | 2026.02 | 17.57 |
| Qwen3-VL | Model Scale=32B | 2026.02 | 14.86 |
| Qwen3-VL | Model Scale=8B | 2026.02 | 9.46 |