Hierarchical Planning on BabyAI Combined Skills 1
[Chart: Token Cost over time. Highlighted point: TheoryCoder-2, 1,961 (Jan 31, 2026). Available metrics: Token Cost, Success Rate.]
Updated 4d ago
Evaluation Results
| Method | Configuration | Date | Token Cost | Success Rate |
|---|---|---|---|---|
| TheoryCoder-2 | Ablation Configuration... | 2026.01 | 1,961 | - |
| LLM + π | Agent Architecture=LLM... | 2026.01 | 40,960 | - |
| LLM + P | Agent Architecture=LLM... | 2026.01 | 41,515 | - |
| TheoryCoder-2 | Ablation Configuration... | 2026.01 | 44,725 | - |
| TheoryCoder-2 | Ablation Configuration... | 2026.01 | 54,277 | - |
| WorldCoder | Agent Architecture=Wor... | 2026.01 | 119,330 | - |