| Dataset Name | SOTA Method | Metric | Value | Trend | Updated |
|---|---|---|---|---|---|
| ALFWorld Seen v1 (in-distribution) | ProCeedSFT | Average Success Rate | 57.14 | 14 | 15d ago |
| ALFRED | LoTA (Full Recompute) | Success Rate (SR) | 45.81 | 11 | 1mo ago |
| ALFWorld (out of distribution) | ProCeedSFT | Accuracy | 58.95 | 8 | 15d ago |
| AI2-THOR | SVLL-Stage 3 | SR | 78.35 | 8 | 1mo ago |
| MRoom-30k 1.0 (test) | OOWM 3-Stage | Similarity | 56.94 | 6 | 5d ago |
| ALFWorld OOD v1 | GFlowVLM w/ SubTB | Success Rate | 12.3 | 6 | 1mo ago |
| ScanNet 3D-LLM | LSceneLLM | ROUGE | 47.05 | 4 | 1mo ago |
| 9 real-world robotic tasks zero-shot | SVLL-Stage 3 | Success Rate (SR) | 55.56 | 2 | 1mo ago |