| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Embodied Reasoning | ALFWorld | Accuracy 0.96 | 151 |
| Instruction Following | ALFWorld | Accuracy 89.3 | 82 |
| Interactive Decision Making | ALFWorld (test) | Success Rate 96.87 | 67 |
| Interactive Decision-making | ALFWorld | PICK 100 | 52 |
| Embodied decision-making | ALFWorld | Success Rate 82.84 | 31 |
| Agent Task | AlfWorld | Success Rate 83.6 | 21 |
| Interactive environment task success | ALFWorld (test) | Overall Success Rate 91.79 | 20 |
| Agentic task completion | ALFWorld | Look Success 100 | 18 |
| Agent Behavior Adaptation | AlfWorld (AW) (test) | Loop Ratio 1,040 | 17 |
| Embodied Task Planning | ALFWorld (test) | Success Rate (Avg) 98.7 | 17 |
| Next-state prediction | ALFWorld (AW) | EM Accuracy 99.87 | 16 |
| Embodied Task Execution | ALFWorld | Success Rate 16.6 | 15 |
| Interactive Decision Making | ALFWorld unseen (test) | Pick Success 98.4 | 14 |
| Task success | ALFWorld | Real Success 91 | 14 |
| Embodied Interaction | ALFWorld | Success Rate 48.97 | 14 |
| Embodied agent | ALFWorld Unseen | Average Reward 90.3 | 12 |
| Embodied agent | ALFWorld Seen | Average Reward 87.2 | 12 |
| Embodied instruction following | ALFWorld official (val) | Success Rate 65.3 | 12 |
| Household simulation | ALFWorld (out-of-distribution) | Put Success Rate 75 | 12 |
| Agentic Task Success | ALFWorld | Success Rate 87.1 | 12 |
| Embodied instruction following | Alfworld | Progress Rate 96.3 | 11 |
| Action Success Rate | ALFWorld | Average Success Rate 0.42 | 10 |
| Interactive Instruction Following | ALFWorld OOD | Success Rate 90.9 | 9 |
| Interactive Instruction Following | ALFWorld (train) | Success Rate 90 | 9 |
| Embodied AI task execution | ALFWorld v1 (ID) | Success Rate (%) 100 | 9 |