Rapid Exploration for Open-World Navigation with Latent Goal Models
About
We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments. At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images. We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration. Trained on a large offline dataset of prior experience, the model acquires a representation of visual goals that is robust to task-irrelevant distractors. We demonstrate our method on a mobile ground robot in open-world exploration scenarios. Given an image of a goal that is up to 80 meters away, our method leverages its representation to explore and discover the goal in under 20 minutes, even amidst previously unseen obstacles and weather conditions. Videos of our experiments and information about the real-world dataset are available on the project website at https://sites.google.com/view/recon-robot.
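To make the idea of a non-parametric topological memory more concrete, here is a minimal sketch in Python. It is not the authors' implementation. Nodes stand in for stored image observations, and edge weights stand in for distances that a learned model would predict. The class name `TopologicalMemory`, its methods, and the numeric distances are all hypothetical and chosen only for illustration. Planning over the memory is done with a plain Dijkstra search, which is an assumption made here and not something the abstract specifies.

```python
import heapq
import itertools
import math
from typing import Dict, Hashable, List, Tuple


class TopologicalMemory:
    """Hypothetical graph memory: nodes are stored observations, and edge
    weights are distances that a learned model would predict between them."""

    def __init__(self) -> None:
        self.edges: Dict[Hashable, Dict[Hashable, float]] = {}

    def add_node(self, node: Hashable) -> None:
        self.edges.setdefault(node, {})

    def add_edge(self, a: Hashable, b: Hashable, distance: float) -> None:
        # Assumption: traversability is symmetric, so edges are undirected.
        self.add_node(a)
        self.add_node(b)
        self.edges[a][b] = distance
        self.edges[b][a] = distance

    def shortest_path(self, start: Hashable, goal: Hashable) -> Tuple[float, List[Hashable]]:
        """Dijkstra search; returns (total distance, node sequence)."""
        if start not in self.edges or goal not in self.edges:
            return math.inf, []
        best = {start: 0.0}
        parent: Dict[Hashable, Hashable] = {}
        tie = itertools.count()  # tie-breaker so nodes are never compared
        frontier = [(0.0, next(tie), start)]
        while frontier:
            cost, _, node = heapq.heappop(frontier)
            if node == goal:
                path = [node]
                while node in parent:
                    node = parent[node]
                    path.append(node)
                return cost, path[::-1]
            if cost > best.get(node, math.inf):
                continue  # stale queue entry
            for nxt, dist in self.edges[node].items():
                new_cost = cost + dist
                if new_cost < best.get(nxt, math.inf):
                    best[nxt] = new_cost
                    parent[nxt] = node
                    heapq.heappush(frontier, (new_cost, next(tie), nxt))
        return math.inf, []


# Illustrative usage with made-up node labels and distances.
memory = TopologicalMemory()
memory.add_edge("start", "path", 4.0)
memory.add_edge("path", "goal", 3.0)
memory.add_edge("start", "goal", 10.0)
cost, route = memory.shortest_path("start", "goal")
print(cost, route)
```

In a real system the node keys would be images or their latent embeddings, and the edge distances would come from the learned model instead of being entered by hand.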
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Goal-conditioned Reinforcement Learning | manipulation-cube-single-play (test) | Success Rate | 0.9 | 11 |
| Goal-conditioned Reinforcement Learning | pointmaze navigate medium | Success Rate | 69 | 11 |
| Goal-Conditioned Reinforcement Learning (Manipulation) | puzzle-3x3-play state-based v0 (test) | Success Rate | 14 | 6 |
| Goal-Conditioned Reinforcement Learning (Manipulation) | scene-play state-based v0 (test) | Success Rate | 58 | 6 |
| Goal-Conditioned Reinforcement Learning (Navigation) | pointmaze-large-navigate state-based v0 (test) | Success Rate | 50 | 6 |
| Goal-Conditioned Reinforcement Learning (Navigation) | antmaze-giant-navigate state-based v0 (test) | Success Rate | 0 | 6 |
| Goal-Conditioned Reinforcement Learning (Navigation) | humanoidmaze-large-navigate state-based v0 (test) | Success Rate | 3 | 6 |
| Goal-Conditioned Reinforcement Learning (Navigation) | antsoccer-arena-navigate state-based v0 (test) | Success Rate | 34 | 6 |
| Goal-Conditioned Reinforcement Learning (Manipulation) | cube-double-play state-based v0 (test) | Success Rate | 33 | 6 |
| Goal-Conditioned Reinforcement Learning (Navigation) | antmaze medium-navigate state-based v0 (test) | Success Rate | 68 | 6 |