Rapid Exploration for Open-World Navigation with Latent Goal Models

About

We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments. At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images. We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration. Trained on a large offline dataset of prior experience, the model acquires a representation of visual goals that is robust to task-irrelevant distractors. We demonstrate our method on a mobile ground robot in open-world exploration scenarios. Given an image of a goal that is up to 80 meters away, our method leverages its representation to explore and discover the goal in under 20 minutes, even amidst previously-unseen obstacles and weather conditions. Please check out the project website for videos of our experiments and information about the real-world dataset used at https://sites.google.com/view/recon-robot.
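The "non-parametric topological memory" mentioned above can be pictured as a graph whose nodes are stored images and whose edges carry distance estimates. The following is a minimal sketch of that general idea, not the authors' implementation: the class name `TopologicalMemory`, its methods, and the caller-supplied distances are all hypothetical, and a real system would obtain the edge weights from the learned distance model rather than from hand-entered numbers.

```python
import heapq


class TopologicalMemory:
    """Toy graph memory: nodes stand for stored images, edges for distance estimates."""

    def __init__(self):
        # node id -> {neighbor id: estimated distance}
        self.edges = {}

    def add_node(self, node):
        self.edges.setdefault(node, {})

    def add_edge(self, a, b, distance):
        # Assumption: the distance is supplied by the caller; in a learned
        # system it would be predicted from the two images.
        self.add_node(a)
        self.add_node(b)
        self.edges[a][b] = distance
        self.edges[b][a] = distance

    def shortest_path(self, start, goal):
        """Dijkstra search; returns (total distance, node list) or (inf, [])."""
        best = {start: 0.0}
        parent = {}
        queue = [(0.0, start)]
        while queue:
            dist, node = heapq.heappop(queue)
            if node == goal:
                # Walk the parent links back to the start to rebuild the path.
                path = [node]
                while node in parent:
                    node = parent[node]
                    path.append(node)
                return dist, path[::-1]
            if dist > best.get(node, float("inf")):
                continue  # stale queue entry, a shorter route was already found
            for neighbor, weight in self.edges.get(node, {}).items():
                candidate = dist + weight
                if candidate < best.get(neighbor, float("inf")):
                    best[neighbor] = candidate
                    parent[neighbor] = node
                    heapq.heappush(queue, (candidate, neighbor))
        return float("inf"), []
```

As a usage illustration, adding edges `start–a` (2.0), `a–goal` (2.0), and `start–goal` (5.0) and calling `shortest_path("start", "goal")` routes through `a`, since the two short hops sum to less than the direct edge. Planning over stored nodes in this way is one common reason to keep a graph-structured memory alongside a learned distance estimate.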

Dhruv Shah, Benjamin Eysenbach, Gregory Kahn, Nicholas Rhinehart, Sergey Levine • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Goal-conditioned Reinforcement Learning | manipulation-cube-single-play (test) | Success Rate 0.9 | 11 |
| Goal-conditioned Reinforcement Learning | pointmaze navigate medium | Success Rate 69 | 11 |
| Goal-Conditioned Reinforcement Learning (Manipulation) | puzzle-3x3-play state-based v0 (test) | Success Rate 14 | 6 |
| Goal-Conditioned Reinforcement Learning (Manipulation) | scene-play state-based v0 (test) | Success Rate 58 | 6 |
| Goal-Conditioned Reinforcement Learning (Navigation) | pointmaze-large-navigate state-based v0 (test) | Success Rate 50 | 6 |
| Goal-Conditioned Reinforcement Learning (Navigation) | antmaze-giant-navigate state-based v0 (test) | Success Rate 0.00e+0 | 6 |
| Goal-Conditioned Reinforcement Learning (Navigation) | humanoidmaze-large-navigate state-based v0 (test) | Success Rate 3 | 6 |
| Goal-Conditioned Reinforcement Learning (Navigation) | antsoccer-arena-navigate state-based v0 (test) | Success Rate 34 | 6 |
| Goal-Conditioned Reinforcement Learning (Manipulation) | cube-double-play state-based v0 (test) | Success Rate 33 | 6 |
| Goal-Conditioned Reinforcement Learning (Navigation) | antmaze medium-navigate state-based v0 (test) | Success Rate 68 | 6 |
Showing 10 of 13 rows
