GUI Navigation on Online-Mind2Web (Easy)
[Chart: Success Rate (y-axis) by date (x-axis), Feb 14, 2026 to Apr 26, 2026. Top result: 78.31 Success Rate, Ovis2.5S-RF++.]
Updated 1mo ago
Evaluation Results
Method                Configuration              Date     Success Rate
Ovis2.5S-RF++         Training Method=REINFO...  2026.02  78.31
Ovis2.5S-GRPO         Training Method=GRPO (...  2026.02  77.11
Ovis2.5S-RLOO         Training Method=RLOO (...  2026.02  75.9
Claude 3.7 Sonnet CU  Model Type=Proprietary     2026.02  72.84
Claude 4 Sonnet CU    Model Type=Proprietary     2026.02  71.6
Ovis2.5SFT            Training Method=SFT        2026.02  66.27
PageGuide             Backbone=google/gemini...  2026.04  60.76
UI-TARS               Model Scale=7B, Versio...  2026.02  57.83
PageGuide             Backbone=google/gemini...  2026.04  52.5
SeeAct                Backbone=GPT-4             2026.04  51.8
Qwen3-VL              Model Scale=32B            2026.02  38.55
Qwen3-VL              Model Scale=8B             2026.02  34.94