Learning a Thousand Tasks in a Day
About
Humans are remarkably efficient at learning tasks from demonstrations, but today's imitation learning methods for robot manipulation often require hundreds or thousands of demonstrations per task. We investigate two fundamental priors for improving learning efficiency: decomposing manipulation trajectories into sequential alignment and interaction phases, and retrieval-based generalisation. Through 3,450 real-world rollouts, we systematically study this decomposition. We compare different design choices for the alignment and interaction phases, and examine generalisation and scaling trends relative to today's dominant paradigm of behavioural cloning with a single-phase monolithic policy. In the few-demonstrations-per-task regime (<10 demonstrations), decomposition achieves an order of magnitude improvement in data efficiency over single-phase learning, with retrieval consistently outperforming behavioural cloning for both alignment and interaction. Building on these insights, we develop Multi-Task Trajectory Transfer (MT3), an imitation learning method based on decomposition and retrieval. MT3 learns everyday manipulation tasks from as few as a single demonstration each, whilst also generalising to novel object instances. This efficiency enables us to teach a robot 1,000 distinct everyday tasks in under 24 hours of human demonstrator time. Through 2,200 additional real-world rollouts, we reveal MT3's capabilities and limitations across different task families. Videos of our experiments can be found at https://www.robot-learning.uk/learning-1000-tasks.
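To make the two priors concrete, the following is a minimal sketch of how retrieval and the alignment/interaction decomposition could fit together. It is not MT3's implementation: the `Demo` fields, the cosine-similarity retrieval, the toy two-dimensional embeddings, and the `plan` function are all hypothetical names and choices introduced here for illustration only.

```python
from __future__ import annotations

import math
from dataclasses import dataclass


@dataclass
class Demo:
    """One stored demonstration (all fields are illustrative assumptions)."""
    task: str
    embedding: list[float]          # hypothetical feature vector for the scene
    start_pose: list[float]         # hypothetical pose where interaction begins
    interaction: list[list[float]]  # hypothetical waypoints replayed from that pose


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query: list[float], demos: list[Demo]) -> Demo:
    """Retrieval step: pick the stored demonstration most similar to the query."""
    return max(demos, key=lambda d: cosine(query, d.embedding))


def plan(query: list[float], demos: list[Demo]) -> dict:
    """Two-phase plan: align to the retrieved start pose, then replay the interaction."""
    demo = retrieve(query, demos)
    return {
        "task": demo.task,
        "align_to": demo.start_pose,  # phase 1: alignment
        "replay": demo.interaction,   # phase 2: interaction
    }


# Toy memory of two demonstrations with made-up numbers.
demos = [
    Demo("open drawer", [1.0, 0.0], [0.4, 0.0, 0.2], [[0.0, 0.0, 0.0], [-0.1, 0.0, 0.0]]),
    Demo("pour water", [0.0, 1.0], [0.3, 0.2, 0.3], [[0.0, 0.0, 0.0], [0.0, 0.0, -0.05]]),
]

result = plan([0.9, 0.1], demos)
print(result["task"])
```

In this sketch a single stored demonstration per task is enough to produce a plan, which is the property the abstract attributes to decomposition and retrieval; how the real system computes embeddings, alignment targets, and interaction trajectories is not specified here. For scale, the abstract's figure of 1,000 tasks in under 24 hours implies an average of under 86.4 seconds of demonstrator time per task.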
Related benchmarks
| Task | Dataset | Success Rate | Rank |
|---|---|---|---|
| PnP-Box | Real-world robot manipulation Static (OOD) | 58 | 14 |
| Hang Cups | Real-world (Unseen) | 72 | 13 |
| Hang Tape | Real-world robot manipulation Static (OOD) | 46 | 7 |
| Hang-Cup | Real-world robot manipulation Static (OOD) | 74 | 7 |
| open drawer | Real-world robot manipulation Static (OOD) | 76 | 7 |
| Pour Water | Real-world robot manipulation Static (OOD) | 78 | 7 |
| PnP-Box | cluttered environments with added distractor objects | 54 | 7 |