| VoTa-Bench (Seen) | D2PO | SR (Examine&Light) 84.72 | | 21 | 1mo ago |
| ALFWorld (test) | AEC | Success Rate (Avg) 98.7 | | 17 | 1mo ago |
| Robotouille Synchronous | GiG+Exp | Pass@1 Accuracy 97 | | 15 | 1mo ago |
| Robotouille Asynchronous (test) | GiG+Exp | Pass@1 Accuracy 86 | | 15 | 1mo ago |
| VoTa-Bench 1.0 (Unseen) | D2PO | Examine&Light SR 82.27 | | 15 | 1mo ago |
| ALFWorld visual observation | RoboAgent | Avg Success Rate 77.6 | | 10 | 9d ago |
| VirtualHome (Seen) | GPT3.5-MCTS | Simple Success 9,140 | | 10 | 1mo ago |
| ALFWorld textual observation (unseen) | RoboAgent | Success Rate 94 | | 9 | 9d ago |
| ALFWorld textual observation (seen) | DynaMind | Success Rate 92.5 | | 9 | 9d ago |
| RoboTwin Library Scene | RoboPARA | TEI 1.32 | | 8 | 1mo ago |
| RoboTwin Pet Shop Scene | RoboPARA | TEI 1.36 | | 8 | 1mo ago |
| RoboTwin Hotel Scene | RoboPARA | TEI 1.12 | | 8 | 1mo ago |
| ALFWorld standard evaluation set (134 tasks) | GiG | Pass@1 Accuracy 97 | | 7 | 1mo ago |
| EB-Habitat (OOD) | GPT-4o | Success Rate 59 | | 6 | 9d ago |
| RLBench Unseen domains | TMoW | Success Rate 62.75 | | 6 | 1mo ago |
| ALFWorld (unseen domains) | TMoW | Success Rate (SR) 68.83 | | 6 | 1mo ago |
| VirtualHome (unseen domains) | TMoW | Success Rate 80.16 | | 6 | 1mo ago |
| RLBench Seen domains | TMoW | Success Rate 71.89 | | 6 | 1mo ago |
| ALFWorld (seen domains) | TMoW | Success Rate (SR) 72.05 | | 6 | 1mo ago |
| VirtualHome Novel Apartment (Unseen) | GPT3.5-MCTS | Simple Success Rate 82.9 | | 4 | 1mo ago |