Rapid Exploration for Open-World Navigation with Latent Goal Models

About

We describe a robotic learning system for autonomous exploration and navigation in diverse, open-world environments. At the core of our method is a learned latent variable model of distances and actions, along with a non-parametric topological memory of images. We use an information bottleneck to regularize the learned policy, giving us (i) a compact visual representation of goals, (ii) improved generalization capabilities, and (iii) a mechanism for sampling feasible goals for exploration. Trained on a large offline dataset of prior experience, the model acquires a representation of visual goals that is robust to task-irrelevant distractors. We demonstrate our method on a mobile ground robot in open-world exploration scenarios. Given an image of a goal that is up to 80 meters away, our method leverages its representation to explore and discover the goal in under 20 minutes, even amidst previously unseen obstacles and weather conditions. Videos of our experiments and information about the real-world dataset are available on the project website: https://sites.google.com/view/recon-robot.
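The abstract mentions a non-parametric topological memory of images. As a rough illustration only, the sketch below treats such a memory as a weighted graph and plans over it with Dijkstra's algorithm. Everything here is an assumption rather than the authors' implementation: the class name `TopologicalMemory`, its methods, the use of string ids in place of stored images, and the edge weights, which stand in for distance estimates that a learned model would supply.

```python
import heapq
from collections import defaultdict


class TopologicalMemory:
    """Hypothetical graph memory: nodes are ids of stored observations,
    edges carry a distance estimate between two observations."""

    def __init__(self):
        self.edges = defaultdict(dict)

    def add_edge(self, a, b, distance):
        # Assumption: distances are symmetric and supplied by the caller;
        # in a learned system they would come from the distance model.
        self.edges[a][b] = distance
        self.edges[b][a] = distance

    def shortest_path(self, start, goal):
        # Plain Dijkstra search over the stored graph.
        frontier = [(0.0, start, [start])]
        visited = set()
        while frontier:
            cost, node, path = heapq.heappop(frontier)
            if node == goal:
                return cost, path
            if node in visited:
                continue
            visited.add(node)
            for neighbor, d in self.edges[node].items():
                if neighbor not in visited:
                    heapq.heappush(frontier, (cost + d, neighbor, path + [neighbor]))
        # Goal is not reachable from the stored graph.
        return float("inf"), []
```

With such a structure, a navigation loop could look up a path of stored observations toward a goal and fall back to exploration when no path exists.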

Dhruv Shah, Benjamin Eysenbach, Gregory Kahn, Nicholas Rhinehart, Sergey Levine • 2021

Related benchmarks

Task	Dataset	Result
Offline Reinforcement Learning	D4RL Franka Kitchen	Mixed Success Rate: 81	43
Robotic Manipulation	D4RL Kitchen-Partial	Normalized Score: 92	23
Robotic Manipulation	D4RL Kitchen-Mixed	--	14
Goal-conditioned Reinforcement Learning	manipulation-cube-single-play (test)	Success Rate: 0.9	11
Goal-conditioned Reinforcement Learning	pointmaze navigate medium	Success Rate: 69	11
Offline goal-conditioned RL	OGBench Manipulation	Success Rate (Cube Single): 90	9
Robotic Manipulation	D4RL kitchen-complete	Slide Cabinet Success Rate: 25	9
Offline goal-conditioned RL	OGBench Navigation	Success Rate (PointMaze-Medium): 69	9
Goal-Conditioned Reinforcement Learning (Manipulation)	puzzle-3x3-play state-based v0 (test)	Success Rate: 14	6
Goal-Conditioned Reinforcement Learning (Manipulation)	scene-play state-based v0 (test)	Success Rate: 58	6

Showing 10 of 19 rows
