BabyAI

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| LLM Agent Navigation | BabyAI (test) | Success Rate | 93.3 | 25 |
| Instruction Following | BabyAI | Success Rate | 72.56 | 14 |
| Instruction Following | BabyAI BossLevel | Success Rate | 96.2 | 14 |
| Imitation Learning | BabyAI BossLevel (test) | Success Rate | 72 | 9 |
| Imitation Learning | BabyAI SynthSeq (test) | Success Rate | 0.642 | 9 |
| Imitation Learning | BabyAI GoToSeq (test) | Success Rate | 77.2 | 9 |
| Instruction Following | BabyAI Synthseq | Average Episodic Reward | 0.361 | 7 |
| Instruction Following | BabyAI Pickup | Average Episodic Reward | 0.486 | 7 |
| Instruction Following | BabyAI Goto | Average Episodic Reward | 0.575 | 7 |
| Bosslevel | BabyAI | Average Pass Rate | 0.343 | 7 |
| Synthseq | BabyAI | Average Pass Rate | 32.1 | 7 |
| Pickup | BabyAI | Average Pass Rate | 33.4 | 7 |
| Goto | BabyAI | Average Pass Rate | 0.606 | 7 |
| Representational Alignment | BabyAI instruction set | P@10 | 46.65 | 7 |
| Hierarchical Planning | BabyAI Combined Skills 3 | Token Cost | 2,454 | 6 |
| Hierarchical Planning | BabyAI Combined Skills 2 | Token Cost | 2,528 | 6 |
| Hierarchical Planning | BabyAI Combined Skills 1 | Token Cost | 1,961 | 6 |
| Hierarchical Planning | BabyAI Unlock | Token Cost | 5,705 | 6 |
| Hierarchical Planning | BabyAI Pickup | Token Cost | 2,405 | 6 |
| Navigation | BabyAI | Success Rate | 93.2 | 2 |