High-level planning on Simulated Tasks ≤2 actions (Short)
Top result: Gemini, 97.19 Success Rate
[Chart: Success Rate by date; data point at Oct 13, 2024]
Updated 1mo ago
Evaluation Results
Method    Description                  Date       Success Rate
Gemini    Planning LLM=Gemini, D...    2024.10    97.19
GPT       Planning LLM=GPT, DINO...    2024.10    96
GPT       Planning LLM=GPT, DINO...    2024.10    93
GPT       Planning LLM=GPT, DINO...    2024.10    92.67