| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| ALFWorld | MemRL | Success Rate (Last Epoch) | 94.9 | 10 | 1mo ago |
| ALFWorld (val) | MemRL | Success Rate | 97.9 | 6 | 3mo ago |
| Diabolical Lock H=100 (test) | RC-GVF | Mean Farthest Column Reached | 100 | 6 | 3mo ago |
| Simulated House | Frontiers | CE | 2.18 | 5 | 1mo ago |
| Simulated Large Warehouse | Frontiers | CE | 0.6 | 5 | 1mo ago |
| environment Medium-scale | | Average Completion Time (steps) | 479.8 | 4 | 1mo ago |
| HM3D Large | | Average Completion Time (steps) | 738.6 | 4 | 1mo ago |
| HM3D Medium | | Average Completion Time (steps) | 517.2 | 4 | 1mo ago |
| HM3D Small | | Average Completion Time (steps) | 474.2 | 4 | 1mo ago |
| IsaacSim Ihlen (unseen) | SSNet | FCR | 0.139 | 4 | 3mo ago |
| IsaacSim Beechwood (unseen) | MJO | Failure Case Rate (FCR) | 39.2 | 4 | 3mo ago |
| IsaacSim Chemistry (unseen) | SSNet | FCR | 34.3 | 4 | 3mo ago |
| Cave environment | FUEL-original | Time (s) | 93.6 | 3 | 1d ago |
| Earthquake environment | FUEL-original | Time taken (s) | 107.3 | 3 | 1d ago |
| Real Warehouse | Frontiers | CE | 2.11 | 3 | 1mo ago |
| Large environment | COMRES-VLM | Average Completion Time (timesteps) | 323.02 | 3 | 3mo ago |
| environment Medium | COMRES-VLM | Average Completion Time (timesteps) | 212.23 | 3 | 3mo ago |
| Small environment | COMRES-VLM | Average Completion Time (timesteps) | 156.12 | 3 | 3mo ago |
| MPE Large-Pass | LEMAE | Exploration Steps (thousands) | 446.9 | 2 | 1d ago |
| MPE Secret-Room | CMAE | Exploration Steps (k) | 1,448,500 | 2 | 1d ago |
| MPE Pass | LEMAE | Exploration Steps (k) | 153.1 | 2 | 1d ago |
| MPE Push-Box | CMAE | Exploration Steps (k) | 972.3 | 2 | 1d ago |
| ARC-AGI-3 | MAP | TU / Level | 93 / 4 | 2 | 20d ago |
| Dense Unstructured Environments | SaferPath | Success Rate | 96 | 2 | 3mo ago |
| MiniHack | CAE+ | RPI (%) | 35.29 | 1 | 3mo ago |