Embodied Planning on AlfWorld (out of distribution)
Best result: ProCeedSFT, 58.95 accuracy (Apr 2, 2026)
Updated 16d ago
Evaluation Results
Method       Backbone      Date      Accuracy
ProCeedSFT   Qwen3-8B      2026.04   58.95
ProCeedRL    Qwen3-8B      2026.04   55.22
DAPO         Qwen3-8B      2026.04   53.24
RFT          Qwen3-8B      2026.04   50.25
Qwen3-8B     Qwen3-8B      2026.04   47.07
DAPO         Qwen3-1.7B    2026.04   24.12
ProCeedRL    Qwen3-1.7B    2026.04   23.33
Qwen3-1.7B   Qwen3-1.7B    2026.04   12.69