GUI Navigation on Online-Mind2Web (Medium)
Best Success Rate: 66.91 (Claude 3.7 Sonnet CU, Feb 14, 2026)
Evaluation Results
Method                Details                      Date     Success Rate
Claude 3.7 Sonnet CU  Model Type=Proprietary       2026.02  66.91
Ovis2.5S-RF++         Training Method=REINFO...    2026.02  65.73
Claude 4 Sonnet CU    Model Type=Proprietary       2026.02  65.47
Ovis2.5S-GRPO         Training Method=GRPO (...    2026.02  62.94
Ovis2.5S-RLOO         Training Method=RLOO (...    2026.02  60.84
Ovis2.5SFT            Training Method=SFT          2026.02  51.75
Qwen3-VL              Model Scale=32B              2026.02  27.97
UI-TARS               Model Scale=7B, Versio...    2026.02  27.27
Qwen3-VL              Model Scale=8B               2026.02  25.87