
Rapid Exploration for Open-World Navigation with Latent Goal Models

About

We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments. At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images. We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration. Trained on a large offline dataset of prior experience, the model acquires a representation of visual goals that is robust to task-irrelevant distractors. We demonstrate our method on a mobile ground robot in open-world exploration scenarios. Given an image of a goal that is up to 80 meters away, our method leverages its representation to explore and discover the goal in under 20 minutes, even amidst previously unseen obstacles and weather conditions. Videos of our experiments and information about the real-world dataset are available on the project website: https://sites.google.com/view/recon-robot.
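The information bottleneck mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a diagonal Gaussian goal encoder and a standard normal prior, neither of which is stated on this page, and the names `kl_to_standard_normal`, `bottleneck_loss`, `sample_latent_goal`, and the weight `beta` are hypothetical. Under those assumptions, the KL penalty keeps the goal representation compact, and drawing from the prior gives a way to sample candidate goals without a goal image.

```python
import math
import random


def kl_to_standard_normal(mean, log_var):
    """KL( N(mean, diag(exp(log_var))) || N(0, I) ), summed over dimensions.

    Assumption: the goal encoder outputs a diagonal Gaussian.
    """
    return 0.5 * sum(
        math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mean, log_var)
    )


def bottleneck_loss(pred_nll, mean, log_var, beta=1.0):
    """Hypothetical training loss: prediction error for actions and distances
    (passed in as a negative log-likelihood) plus a beta-weighted KL penalty."""
    return pred_nll + beta * kl_to_standard_normal(mean, log_var)


def sample_latent_goal(dim, rng=random):
    """Draw a latent goal from the assumed standard normal prior."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]
```

A larger `beta` compresses the goal representation more aggressively, at the cost of prediction accuracy. The right trade-off would have to be tuned for a given dataset.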

Dhruv Shah, Benjamin Eysenbach, Gregory Kahn, Nicholas Rhinehart, Sergey Levine • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Offline Reinforcement Learning | D4RL Franka Kitchen | Mixed Success Rate: 81 | 43 |
| Robotic Manipulation | D4RL Kitchen-Partial | Normalized Score: 92 | 23 |
| Robotic Manipulation | D4RL Kitchen-Mixed | -- | 14 |
| Goal-conditioned Reinforcement Learning | manipulation-cube-single-play (test) | Success Rate: 0.9 | 11 |
| Goal-conditioned Reinforcement Learning | pointmaze navigate medium | Success Rate: 69 | 11 |
| Offline goal-conditioned RL | OGBench Manipulation | Success Rate (Cube Single): 90 | 9 |
| Robotic Manipulation | D4RL kitchen-complete | Slide Cabinet Success Rate: 25 | 9 |
| Offline goal-conditioned RL | OGBench Navigation | Success Rate (PointMaze-Medium): 69 | 9 |
| Goal-Conditioned Reinforcement Learning (Manipulation) | puzzle-3x3-play state-based v0 (test) | Success Rate: 14 | 6 |
| Goal-Conditioned Reinforcement Learning (Manipulation) | scene-play state-based v0 (test) | Success Rate: 58 | 6 |
Showing 10 of 19 rows
