GELLO: A General, Low-Cost, and Intuitive Teleoperation Framework for Robot Manipulators
About
Humans can teleoperate robots to accomplish complex manipulation tasks. Imitation learning has emerged as a powerful framework that leverages human teleoperated demonstrations to teach robots new skills. However, the performance of the learned policies is bottlenecked by the quality, scale, and variety of the demonstration data. In this paper, we aim to lower the barrier to collecting large, high-quality human demonstration datasets by proposing a GEneraL framework for building LOw-cost and intuitive teleoperation systems for robotic manipulation (GELLO). Given a target robot arm, we build a GELLO controller device that has the same kinematic structure as the target arm, leveraging 3D-printed parts and economical off-the-shelf motors. GELLO is easy to build and intuitive to use. Through an extensive user study, we show that GELLO enables more reliable and efficient demonstration collection than other cost-efficient teleoperation devices commonly used in the imitation learning literature, such as virtual reality controllers and 3D spacemouses. We further demonstrate the capabilities of GELLO for performing complex bi-manual and contact-rich manipulation tasks. To make GELLO accessible to everyone, we have designed and built GELLO systems for three commonly used robotic arms: Franka, UR5, and xArm. All software and hardware are open-sourced and can be found on our website: https://wuphilipp.github.io/gello/.
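Because the controller shares the target arm's kinematic structure, teleoperation can in principle reduce to a joint-by-joint mapping with no inverse kinematics. The following is a minimal sketch of that idea, not the released GELLO software: the `JointMap` class, its `signs` and `offsets` fields, and the calibration values are hypothetical names introduced here for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class JointMap:
    """Hypothetical per-joint calibration between a controller and a target arm."""

    signs: Sequence[int]      # assumed: +1 or -1, motor direction relative to the arm joint
    offsets: Sequence[float]  # assumed: controller reading (rad) at the arm's zero pose

    def leader_to_follower(self, leader_angles: Sequence[float]) -> List[float]:
        """Map controller joint angles (rad) to target-arm joint angles (rad)."""
        if len(leader_angles) != len(self.signs) or len(self.signs) != len(self.offsets):
            raise ValueError("joint count mismatch between controller and calibration")
        # Same kinematic structure, so each controller joint drives exactly one arm joint.
        return [s * (q - o) for s, q, o in zip(self.signs, leader_angles, self.offsets)]


# Illustrative two-joint example with made-up calibration values.
joint_map = JointMap(signs=[1, -1], offsets=[0.5, 0.0])
targets = joint_map.leader_to_follower([1.5, 0.25])
```

A real system would additionally need to clamp the targets to the arm's joint limits and handle the gripper separately; both are omitted here.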
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Bowl Opening | Bowl Opening | Time (s) | 1.08e+4 | 5 |
| Box Opening | Box Opening | Time (s) | 1.76e+4 | 5 |
| Cup Handover | Cup Handover | Execution Time (s) | 899 | 5 |
| Pen Picking | Pen Picking | Latency (s) | 425 | 5 |
| Rice Scooping | Rice Scooping | Latency (s) | 529 | 5 |
| Salt Scooping | Salt Scooping | Time (s) | 5.81e+3 | 5 |
| Soybeans Scooping | Soybeans Scooping | Time (s) | 6.22e+3 | 5 |
| Spoon Grasping | Spoon Grasping | Time (s) | 2.31e+4 | 5 |
| Towel Folding | Towel Folding | Time (s) | 1.79e+4 | 5 |
| Water Pouring | Water Pouring | Time (s) | 9.31e+3 | 5 |