Task-oriented Plan Retrieval on Travel Itinerary Single-turn
[Chart: Pass Rate over time. Best result: 95 (In-context), Mar 1, 2026. Metrics shown: Pass Rate, Token Count.]
Updated 1mo ago
Evaluation Results
Method          Configuration               Date     Pass Rate  Token Count
In-context      Model=Gemini 3 Flash,...    2026.03  95         12,846
Semantic XPath  Model=GPT-5 mini, Scor...   2026.03  85         5,161
Semantic XPath  Model=GPT-5 mini, Scor...   2026.03  75         5,548
Semantic XPath  Model=Gemini 3 Flash,...    2026.03  75         5,240
Semantic XPath  Model=Gemini 3 Flash,...    2026.03  65         5,454
In-context      Model=GPT-5 mini, Scor...   2026.03  55         12,136
Flat RAG        Model=Gemini 3 Flash,...    2026.03  35         3,349
Flat RAG        Model=GPT-5 mini, Scor...   2026.03  25         3,137