Web Navigation on WebShop Drift I
Best Success Rate: 95 (Generative Agent + GLOVE)
[Chart: Success Rate by date; plotted date: Jan 27, 2026]
Evaluation Results
| Method | Configuration | Date | Success Rate |
|---|---|---|---|
| Generative Agent + GLOVE | Backbone=GPT-4o, Agent... | 2026.01 | 95 |
| MemoryBank + GLOVE | Base LLM=Grok-3, Agent... | 2026.01 | 90 |
| Voyager + GLOVE | Base LLM=Grok-3, Agent... | 2026.01 | 90 |
| Generative Agent + GLOVE | Base LLM=Grok-3, Agent... | 2026.01 | 90 |
| MemoryBank + GLOVE | Backbone=GPT-4o, Agent... | 2026.01 | 90 |
| Voyager + GLOVE | Backbone=GPT-4o, Agent... | 2026.01 | 90 |
| Vanilla + GLOVE | Backbone=GPT-4o, Agent... | 2026.01 | 85 |
| Vanilla + GLOVE | Base LLM=Grok-3, Agent... | 2026.01 | 80 |
| MemoryBank | Base LLM=Grok-3, Agent... | 2026.01 | 35 |
| MemoryBank | Backbone=GPT-4o, Agent... | 2026.01 | 20 |
| No Memory (Plain) | Base LLM=Grok-3, Agent... | 2026.01 | 5 |
| No Memory (Plain) | Backbone=GPT-4o, Memor... | 2026.01 | 5 |
| Vanilla | Base LLM=Grok-3, Agent... | 2026.01 | 0 |
| Voyager | Base LLM=Grok-3, Agent... | 2026.01 | 0 |
| Generative Agent | Base LLM=Grok-3, Agent... | 2026.01 | 0 |
| Vanilla | Backbone=GPT-4o, Agent... | 2026.01 | 0 |
| Voyager | Backbone=GPT-4o, Agent... | 2026.01 | 0 |
| Generative Agent | Backbone=GPT-4o, Agent... | 2026.01 | 0 |