Hierarchical Planning on BabyAI Combined Skills 2
[Chart: Token Cost by date. Data point shown: TheoryCoder-2, Token Cost 2,528, Jan 31, 2026. Selectable metrics: Token Cost, Success Rate.]
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | Token Cost | Success Rate |
| --- | --- | --- | --- | --- |
| TheoryCoder-2 | Ablation Configuration... | 2026.01 | 2,528 | - |
| TheoryCoder-2 | Ablation Configuration... | 2026.01 | 45,175 | - |
| LLM + π | Agent Architecture=LLM... | 2026.01 | 49,973 | - |
| TheoryCoder-2 | Ablation Configuration... | 2026.01 | 53,376 | - |
| LLM + P | Agent Architecture=LLM... | 2026.01 | 59,003 | - |
| WorldCoder | Agent Architecture=Wor... | 2026.01 | 120,200 | - |