Embodied Decision-making on ALFWorld (ID and OOD)
[Chart: ID Success Rate, top result 86.7 by RWML + Policy RL (Feb 5, 2026). Metrics shown: ID Success Rate, OOD Success Rate, Average Success Rate.]
Updated 4d ago
Evaluation Results
| Method | Training Paradigm | Date | ID Success Rate | OOD Success Rate | Average Success Rate |
|---|---|---|---|---|---|
| RWML + Policy RL | Self... | 2026.02 | 86.7 | 90.1 | 87.9 |
| IWM | Lear... | 2026.02 | 85.6 | 78.1 | 83.1 |
| Imitation Learning | Lear... | 2026.02 | 84.9 | 77.6 | 82.5 |
| SR | Lear... | 2026.02 | 83.9 | 82.3 | 83.3 |
| WM SFT + Policy RL | Self... | 2026.02 | 76.2 | 82.3 | 80.4 |