Instruction Following on BabyAI Goto
[Chart: Average Episodic Reward. Leading entry: Poly-PPO, 0.575 (Sep 29, 2025)]
Updated 4d ago
Evaluation Results
Method             Configuration              Date     Average Episodic Reward  Success Rate (%)
Poly-PPO           RL Algorithm=Poly-PPO,...  2025.09  0.575                    80.2
Poly-PPO w/ UCB    RL Algorithm=Poly-PPO,...  2025.09  0.561                    76.2
REINFORCE w/ UCB   RL Algorithm=REINFORCE...  2025.09  0.538                    73.4
REINFORCE          RL Algorithm=REINFORCE...  2025.09  0.533                    73
PPO w/ UCB         RL Algorithm=PPO, Expl...  2025.09  0.428                    47.4
PPO                RL Algorithm=PPO, Expl...  2025.09  0.406                    46.2
Pretrained policy  RL Algorithm=Pretraine...  2025.09  0.246                    34.2