Web Navigation on MiniWob++
[Chart: Accuracy over time. Top result: Explorer-7B, 53.26 accuracy (Feb 17, 2025).]
Evaluation Results
| Method | Details | Date | Accuracy |
|---|---|---|---|
| Explorer-7B | Parameters=7B, zero-sh... | 2025.02 | 53.26 |
| GPT-4 | Model Type=API-based,... | 2025.02 | 53.04 |
| Llama3-chat-70B | Model Type=Open-source... | 2025.02 | 48.7 |
| Explorer-4B | Parameters=4B, zero-sh... | 2025.02 | 46.74 |
| AgentTrek-7B | Model Type=Open-source... | 2025.02 | 45.28 |
| GPT-3.5 | Model Type=API-based,... | 2025.02 | 39.57 |
| Synatra-CodeLlama-7B | Model Type=Open-source... | 2025.02 | 38.2 |
| Qwen2-VL-7B | Model Type=Open-source... | 2025.02 | 36.96 |
| AgentLM-70B | Model Type=Open-source... | 2025.02 | 36.52 |
| Phi-3.5V | Model Type=Open-source... | 2025.02 | 35.87 |
| Llama3-chat-8B | Model Type=Open-source... | 2025.02 | 31.74 |
| Lemur-chat-70B | Model Type=Open-source... | 2025.02 | 21.3 |
| AgentFlan-7B | Model Type=Open-source... | 2025.02 | 20.87 |
| AgentLM-7B | Model Type=Open-source... | 2025.02 | 15.65 |
| CodeActAgent-7B | Model Type=Open-source... | 2025.02 | 9.78 |