| Dataset Name | SOTA Method | Metric | Value | Trend | |
|---|---|---|---|---|---|
| EmbodiedBench EB-ALFRED | A-Mem | Average Success Rate | 86.69 | 24 | 23d ago |
| EB-ALFRED online unsupervised setting | ELITE | Success Rate (Avg) | 61 | 10 | 2mo ago |
| ALFWorld v1 (ID) | AgentMark | Success Rate (%) | 100 | 9 | 3mo ago |
| ALFWorld OOD v1 | AgentMark | SR | 0.994 | 7 | 3mo ago |
| ALFWorld (held-out) | MemRL | SR | 97.9 | 6 | 2mo ago |