Plan Generation on Robo Challenge (Offline)
[Chart: Plan Accuracy over time. Highlighted result: CP-SAT Formalizer, 100 (May 31, 2026).]
Updated 1d ago
Evaluation Results

Method              Model               Date      Plan Accuracy
CP-SAT Formalizer   Gemini-3-flash      2026.05   100
CP-SAT Formalizer   GPT-5-mini          2026.05   99.3
CP-SAT Formalizer   DeepSeek-V4-Flash   2026.05   97.9
CP-SAT Formalizer   Qwen3.6 35B A3B     2026.05   97.9
PDDL2.1 Formalizer  Qwen3.6 35B A3B     2026.05   67.9
Planner             Qwen3.6 35B A3B     2026.05   59.3
PDDL2.1 Formalizer  GPT-5-mini          2026.05   55.7
PDDL2.1 Formalizer  Gemini-3-flash      2026.05   52.9
Planner             GPT-5-mini          2026.05   46.4
Planner             Gemini-3-flash      2026.05   40.7
PDDL2.1 Formalizer  DeepSeek-V4-Flash   2026.05   27.9
Planner             DeepSeek-V4-Flash   2026.05   12.9