AlfWorld

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Embodied Reasoning | ALFWorld | Accuracy 0.96 | 151 |
| Instruction Following | ALFWorld | Accuracy 89.3 | 82 |
| Interactive Decision Making | ALFWorld (test) | Success Rate 96.87 | 67 |
| Interactive Decision-making | ALFWorld | PICK 100 | 52 |
| Embodied decision-making | ALFWorld | Success Rate 82.84 | 31 |
| Agent Task | AlfWorld | Success Rate 83.6 | 21 |
| Interactive environment task success | ALFWorld (test) | Overall Success Rate 91.79 | 20 |
| Agentic task completion | ALFWorld | Look Success 100 | 18 |
| Agent Behavior Adaptation | AlfWorld (AW) (test) | Loop Ratio 1,040 | 17 |
| Embodied Task Planning | ALFWorld (test) | Success Rate (Avg) 98.7 | 17 |
| Next-state prediction | ALFWorld (AW) | EM Accuracy 99.87 | 16 |
| Embodied Task Execution | ALFWorld | Success Rate 16.6 | 15 |
| Interactive Decision Making | ALFWorld unseen (test) | Pick Success 98.4 | 14 |
| Task success | ALFWorld | Real Success 91 | 14 |
| Embodied Interaction | ALFWorld | Success Rate 48.97 | 14 |
| Embodied agent | ALFWorld Unseen | Average Reward 90.3 | 12 |
| Embodied agent | ALFWorld Seen | Average Reward 87.2 | 12 |
| Embodied instruction following | ALFWorld official (val) | Success Rate 65.3 | 12 |
| Household simulation | ALFWorld (out-of-distribution) | Put Success Rate 75 | 12 |
| Agentic Task Success | ALFWorld | Success Rate 87.1 | 12 |
| Embodied instruction following | Alfworld | Progress Rate 96.3 | 11 |
| Action Success Rate | ALFWorld | Average Success Rate 0.42 | 10 |
| Interactive Instruction Following | ALFWorld OOD | Success Rate 90.9 | 9 |
| Interactive Instruction Following | ALFWorld (train) | Success Rate 90 | 9 |
| Embodied AI task execution | ALFWorld v1 (ID) | Success Rate (%) 100 | 9 |
Showing 25 of 42 rows