Trip Planning on Natural Plan Trip Planning
Top result: SMaRT, Success Rate 33.2 (Oct 20, 2025)
Evaluation Results
Method          Settings                   Date     Success Rate
SMaRT           Model=Gemini-1.5, Numb...  2025.10  33.2
Direct          Model=Gemini-1.5, Numb...  2025.10  32.2
CoT             Model=Gemini-1.5, Numb...  2025.10  31.2
LLM-as-a-Judge  Model=Gemini-1.5, Numb...  2025.10  29.2
SMaRT           Model=GPT-4, Number of...  2025.10  24.5
CoT             Model=GPT-4, Number of...  2025.10  9.4
LLM-as-a-Judge  Model=GPT-4, Number of...  2025.10  6.2
Direct          Model=GPT-4, Number of...  2025.10  5.6