Language-based decision-making on WebShop (test)
Best result: AUTOGUIDE + Reflexion, Reward 81.4 (Mar 13, 2024)
Updated 4d ago
Evaluation Results
Method                 Settings                   Date     Reward  Success Rate
AUTOGUIDE + Reflexion  Offline data=true, Con...  2024.03  81.4    57
ReAct + Reflexion      Offline data=false, Co...  2024.03  77.1    51
AUTOGUIDE              Offline data=true, Con...  2024.03  73.4    46
ExpeL + Reflexion      Offline data=true, Con...  2024.03  71.7    42
ReAct                  Offline data=false, Co...  2024.03  66.4    30
ExpeL                  Offline data=true, Con...  2024.03  60.9    35