Behavior Transformers: Cloning $k$ modes with one stone
About
While behavior learning has made impressive progress in recent times, it lags behind computer vision and natural language processing due to its inability to leverage large, human-generated datasets. Human behaviors have wide variance and multiple modes, and human demonstrations typically do not come with reward labels. These properties limit the applicability of current methods in Offline RL and Behavioral Cloning to learn from large, pre-collected datasets. In this work, we present Behavior Transformer (BeT), a new technique to model unlabeled demonstration data with multiple modes. BeT retrofits standard transformer architectures with action discretization coupled with a multi-task action correction inspired by offset prediction in object detection. This allows us to leverage the multi-modal modeling ability of modern transformers to predict multi-modal continuous actions. We experimentally evaluate BeT on a variety of robotic manipulation and self-driving behavior datasets. We show that BeT significantly improves over prior state-of-the-art work on solving demonstrated tasks while capturing the major modes present in the pre-collected datasets. Finally, through an extensive ablation study, we analyze the importance of every crucial component in BeT. Videos of behavior generated by BeT are available at https://notmahi.github.io/bet
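To make the idea of action discretization with an offset correction concrete, here is a minimal sketch. It is not the authors' implementation. It assumes that continuous actions are grouped into $k$ bins with plain k-means and that each action is then represented as a bin index plus a continuous residual from that bin's center. The function names (`fit_action_bins`, `encode`, `decode`) and the synthetic two-mode data are hypothetical and exist only for illustration. No transformer is involved here; in a full model, the network would predict the bin and the residual from observations.

```python
import numpy as np


def fit_action_bins(actions, k, n_iters=20, seed=0):
    """Plain Lloyd's k-means over continuous actions (illustrative only).

    Returns an array of shape (k, action_dim) holding the bin centers.
    """
    rng = np.random.default_rng(seed)
    # Initialize centers from k distinct demonstration actions.
    centers = actions[rng.choice(len(actions), size=k, replace=False)].copy()
    for _ in range(n_iters):
        # Distance from every action to every center: shape (N, k).
        dists = np.linalg.norm(actions[:, None, :] - centers[None, :, :], axis=-1)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = actions[labels == j]
            if len(members) > 0:  # leave a center unchanged if its bin is empty
                centers[j] = members.mean(axis=0)
    return centers


def encode(actions, centers):
    """Split each action into (nearest bin index, residual offset from that bin)."""
    dists = np.linalg.norm(actions[:, None, :] - centers[None, :, :], axis=-1)
    bins = dists.argmin(axis=1)
    offsets = actions - centers[bins]
    return bins, offsets


def decode(bins, offsets, centers):
    """Recover continuous actions as bin center plus offset."""
    return centers[bins] + offsets


# Hypothetical two-mode demonstration data in a 2-D action space.
rng = np.random.default_rng(0)
demo_actions = np.concatenate([
    rng.normal(-1.0, 0.05, size=(100, 2)),
    rng.normal(1.0, 0.05, size=(100, 2)),
])

centers = fit_action_bins(demo_actions, k=2)
bins, offsets = encode(demo_actions, centers)
reconstructed = decode(bins, offsets, centers)
```

The bin index is a categorical target, which is the kind of output a transformer can model as a multi-modal distribution, while the offset restores the precision lost by discretization. With ground-truth bins and offsets, as in this sketch, decoding reproduces the original actions up to floating-point error.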
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Hang | ManiSkill2 | Success Rate | 18.6 | 14 |
| PickCube | ManiSkill2 | Success Rate | 82.6 | 14 |
| StackCube | ManiSkill2 | Success Rate | 81.2 | 14 |
| PickYCB | ManiSkill2 | Success Rate | 34.6 | 14 |
| Fill | ManiSkill2 | Success Rate | 10.6 | 14 |
| Bimanual Insertion | Bimanual Insertion sim | Grasp Success | 21 | 10 |
| Cube Transfer | Cube Transfer sim | Touched Count | 60 | 10 |
| Square | RoboMimic MH 300 trajectories Full (multi-human) | Success Rate | 43 | 9 |
| Transport | RoboMimic multi-human 300 trajectories Full | Success Rate | 6 | 9 |
| Lift | RoboMimic MH 300 trajectories Full (multi-human) | Success Rate | 99 | 5 |