Our new X account is live! Follow @wizwand_team for updates
Home
/
Benchmarks
Hierarchical Planning on BabyAI Combined Skills 3
Loading...
2,454
Token Cost
TheoryCoder-2
-2,262.84
29,575.83
61,414.5
93,253.17
Jan 31, 2026
Token Cost
Success Rate
Updated 4d ago
Evaluation Results
Method
Method
Links
Token Cost
Success Rate
TheoryCoder-2
Ablation Configuration...
2026.01
2,454
-
LLM + π
Agent Architecture=LLM...
2026.01
29,791
-
TheoryCoder-2
Ablation Configuration...
2026.01
45,017
-
TheoryCoder-2
Ablation Configuration...
2026.01
53,064
-
LLM + P
Agent Architecture=LLM...
2026.01
55,078
-
WorldCoder
Agent Architecture=Wor...
2026.01
120,375
-
Feedback
Search any
task
Search any
task