
Hindsight Experience Replay

About

Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay which allows sample-efficient learning from rewards that are sparse and binary, thereby avoiding the need for complicated reward engineering. It can be combined with an arbitrary off-policy RL algorithm and may be seen as a form of implicit curriculum. We demonstrate our approach on the task of manipulating objects with a robotic arm. In particular, we run experiments on three different tasks: pushing, sliding, and pick-and-place, in each case using only binary rewards indicating whether or not the task is completed. Our ablation studies show that Hindsight Experience Replay is a crucial ingredient which makes training possible in these challenging environments. We show that our policies trained on a physics simulation can be deployed on a physical robot and successfully complete the task.
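The core idea described above can be sketched in a few lines: after an episode ends, each transition is stored not only with the originally intended goal but also relabeled with goals that were actually achieved later in the episode, with the sparse binary reward recomputed for each new goal. The sketch below is a minimal illustration, not the authors' implementation; `her_relabel`, the tuple layout, and the simplification that the "achieved goal" equals the next state are all assumptions made for brevity.

```python
import random

def her_relabel(episode, k=4, reward_fn=None):
    """Augment an episode's transitions with hindsight goals.

    episode: list of (state, action, next_state, goal) tuples, where the
    achieved goal is assumed (for this sketch) to equal next_state.
    Uses the "future" strategy: for each transition, sample up to k goals
    achieved later in the same episode, relabel the transition as if those
    had been the intended goals, and recompute the sparse binary reward.
    """
    if reward_fn is None:
        # Sparse binary reward: 0 on reaching the goal, -1 otherwise.
        reward_fn = lambda achieved, goal: 0.0 if achieved == goal else -1.0

    buffer = []
    for t, (s, a, s_next, g) in enumerate(episode):
        # Store the original transition with its original goal.
        buffer.append((s, a, s_next, g, reward_fn(s_next, g)))
        # Relabel with goals actually achieved from step t onward.
        future_achieved = [tr[2] for tr in episode[t:]]
        for g_new in random.sample(future_achieved, min(k, len(future_achieved))):
            buffer.append((s, a, s_next, g_new, reward_fn(s_next, g_new)))
    return buffer
```

Because every relabeled goal was actually reached, some stored transitions always carry a success reward, which is what turns an otherwise uninformative binary signal into a usable learning signal for any off-policy algorithm consuming the buffer.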

Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba • 2017

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Continual Reinforcement Learning | Meta-World MT50 v2 | AP | 71.2 | 11 |
| Offline Goal-Conditioned Reinforcement Learning | FetchReach (offline) | Discounted Return | 29.8 | 10 |
| Robotic Block Manipulation | HandManipulateBlockFull v0 | Success Rate (%) | 4 | 10 |
| Robotic Egg Manipulation | HandManipulateEggFull v0 | Success Rate (%) | 22 | 10 |
| Robotic Hand Reaching | HandReach v0 | Success Rate (%) | 54 | 10 |
| Robotic Pushing | FetchPush v1 | Success Rate (%) | 99 | 10 |
| Visual Pickup | SkewFit | Goal Reaching Error (m) | 0.035 | 10 |
| Offline Goal-Conditioned Reinforcement Learning | FetchPick (offline) | Discounted Return | 16.8 | 10 |
| Robotic Pen Rotation | HandManipulatePenRotate v0 | Success Rate (%) | 18 | 10 |
| Robotic Pick-and-Place | FetchPickAndPlace v1 | Success Rate (%) | 88 | 10 |
Showing 10 of 30 rows
