Web Task on Mind2Web (test)
Top result: LUMOS-I_Web-13B, Step-wise Success Rate 31.3 (Nov 9, 2023)
Updated 4d ago
Evaluation Results
Method             Type                   Date      Step-wise Success Rate
LUMOS-I_Web-13B    Open-source, Fine...   2023.11   31.3
LUMOS-I_Web        Open-source, Fine...   2023.11   27.6
GPT-4              GPT/API-based          2023.11   22.6
GPT-3.5-turbo      GPT/API-based          2023.11   15.7
AgentLM-70B        Open-source, Fine...   2023.11   13.5
Koala-13B          Open-source            2023.11   6
WizardLM-30B       Open-source            2023.11   3.1
Baichuan-13B-chat  Open-source            2023.11   2.3