GUI Navigation on Mind2Web Online (Average)
[Chart: Success Rate on Mind2Web Online (Average). Highlighted entry: Ovis2.5S-GRPO, Success Rate 64, Feb 14, 2026. Metrics: Success Rate, Improvement over SFT.]
Evaluation Results
Method               | Details                   | Date    | Success Rate | Improvement over SFT
---------------------+---------------------------+---------+--------------+---------------------
Ovis2.5S-GRPO        | Training Method=GRPO (... | 2026.02 | 64           | 11.33
Ovis2.5S-RF++        | Training Method=REINFO... | 2026.02 | 64           | 11.33
Ovis2.5S-RLOO        | Training Method=RLOO (... | 2026.02 | 62.67        | 10
Claude 4 Sonnet CU   | Model Type=Proprietary    | 2026.02 | 62.33        | -
Claude 3.7 Sonnet CU | Model Type=Proprietary    | 2026.02 | 61           | -
Ovis2.5SFT           | Training Method=SFT       | 2026.02 | 52.67        | -
GPT-4o               | Model Type=Proprietary    | 2026.02 | 37           | -
UI-TARS              | Model Scale=7B, Versio... | 2026.02 | 33.33        | -
Qwen3-VL             | Model Scale=32B           | 2026.02 | 27.67        | -
Qwen3-VL             | Model Scale=8B            | 2026.02 | 24.33        | -
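
The Improvement over SFT values appear to equal each method's Success Rate minus the Ovis2.5SFT Success Rate (52.67). The page does not state this definition, so the short Python sketch below treats it as an assumption and only checks the arithmetic against the listed numbers. The variable names are illustrative.

```python
# Success Rate values copied from the results listed above.
sft_baseline = 52.67  # Ovis2.5SFT

success_rate = {
    "Ovis2.5S-GRPO": 64.0,
    "Ovis2.5S-RF++": 64.0,
    "Ovis2.5S-RLOO": 62.67,
}

# Assumption (not stated on the page): improvement is the absolute
# difference in Success Rate relative to the SFT baseline.
improvement = {
    name: round(rate - sft_baseline, 2)
    for name, rate in success_rate.items()
}

for name, delta in improvement.items():
    print(f"{name}: {delta}")
```

Under this assumption the computed differences match the listed values of 11.33, 11.33, and 10.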