Our new X account is live! Follow @wizwand_team for updates
Home
/
Benchmarks
Hierarchical Planning on BabyAI Pickup
Loading...
2,405
Token Cost
LLM + π
1,677.64
6,587.32
11,497
16,406.68
Jan 31, 2026
Token Cost
Success Rate
Updated 4d ago
Evaluation Results
Method
Method
Links
Token Cost
Success Rate
LLM + π
Agent Architecture=LLM...
2026.01
2,405
-
TheoryCoder-2
Ablation Configuration...
2026.01
6,660
-
TheoryCoder-2
Ablation Configuration...
2026.01
8,588
-
TheoryCoder-2
Ablation Configuration...
2026.01
8,588
-
WorldCoder
Agent Architecture=Wor...
2026.01
18,013
-
LLM + P
Agent Architecture=LLM...
2026.01
20,589
-
Feedback
Search any
task
Search any
task