Planning on TravelPlanner (test)
Best result: AutoRefine, Success Rate 0.271 (Jan 30, 2026)
Evaluation Results
Method             Backbone     Date     Success Rate  Steps
AutoRefine         GPT-4-turbo  2026.01  0.271         21.8
ReAct              GPT-4-turbo  2026.01  0.104         26.1
ReAct + Reflexion  GPT-4-turbo  2026.01  0.091         80.2
Reflexion          GPT-4-turbo  2026.01  0.086         77.9