Hindsight Experience Replay

About

Dealing with sparse rewards is one of the biggest challenges in Reinforcement Learning (RL). We present a novel technique called Hindsight Experience Replay, which allows sample-efficient learning from rewards that are sparse and binary and therefore avoids the need for complicated reward engineering. It can be combined with an arbitrary off-policy RL algorithm and may be seen as a form of implicit curriculum. We demonstrate our approach on the task of manipulating objects with a robotic arm. In particular, we run experiments on three different tasks: pushing, sliding, and pick-and-place, in each case using only binary rewards indicating whether or not the task is completed. Our ablation studies show that Hindsight Experience Replay is a crucial ingredient that makes training possible in these challenging environments. We show that our policies, trained in a physics simulation, can be deployed on a physical robot and successfully complete the task.

Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, Wojciech Zaremba • 2017
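
The abstract describes learning from sparse, binary rewards by replaying experience "in hindsight". As a rough illustration only, the sketch below relabels a failed episode with the goal it actually reached, so that at least one stored transition carries a success signal. The `Transition` fields, the `sparse_reward` convention (0.0 on success, -1.0 otherwise), the tolerance value, and the choice of the final achieved state as the substitute goal are all assumptions made for this example, not details taken from this page.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple

Vector = Tuple[float, ...]


@dataclass(frozen=True)
class Transition:
    """One step of a goal-conditioned episode (hypothetical layout)."""
    state: Vector
    action: int
    next_state: Vector
    achieved_goal: Vector  # what was actually reached in next_state
    goal: Vector           # what the policy was asked to reach
    reward: float


def sparse_reward(achieved: Vector, goal: Vector, tol: float = 0.05) -> float:
    """Binary reward: 0.0 within `tol` of the goal, -1.0 otherwise (assumed convention)."""
    dist = sum((a - g) ** 2 for a, g in zip(achieved, goal)) ** 0.5
    return 0.0 if dist <= tol else -1.0


def relabel_with_final_goal(episode: List[Transition]) -> List[Transition]:
    """Copy the episode, pretending the last achieved goal was the target all along."""
    if not episode:
        return []
    hindsight_goal = episode[-1].achieved_goal
    return [
        replace(
            t,
            goal=hindsight_goal,
            reward=sparse_reward(t.achieved_goal, hindsight_goal),
        )
        for t in episode
    ]


# A 1-D episode that never reaches the original goal at 1.0.
original_goal = (1.0,)
episode = [
    Transition((0.0,), 1, (0.2,), (0.2,), original_goal,
               sparse_reward((0.2,), original_goal)),
    Transition((0.2,), 1, (0.4,), (0.4,), original_goal,
               sparse_reward((0.4,), original_goal)),
]

relabeled = relabel_with_final_goal(episode)

# An off-policy learner would sample from both the original and relabeled data.
replay_buffer = episode + relabeled
```

Every original transition here has reward -1.0, while the last relabeled transition has reward 0.0, which gives an off-policy learner something to learn from even though the original goal was never reached.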

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Offline Goal-Conditioned Reinforcement Learning | FetchReach (offline) | Discounted Return: 29.8 | 10 |
| Visual Pickup | SkewFit | Goal Reaching Error (m): 0.035 | 10 |
| Offline Goal-Conditioned Reinforcement Learning | FetchPick (offline) | Discounted Return: 16.8 | 10 |
| Visual Pusher | SkewFit | Goal Reaching Error: 0.075 | 10 |
| Offline Goal-Conditioned Reinforcement Learning | HandReach (offline) | Discounted Return: 0.81 | 10 |
| Offline Goal-Conditioned Reinforcement Learning | FetchPush (offline) | Discounted Return: 12.5 | 10 |
| Offline Goal-Conditioned Reinforcement Learning | FetchSlide (offline) | Discounted Return: 1.08 | 10 |
| Lane Following | CARLA Town07 (unseen routes) | Success Rate: 0.4552 | 5 |
| Overtaking | CARLA Town04 (unseen routes) | Success Rate: 34 | 5 |
| Offline Goal-Conditioned Reinforcement Learning | D'ClawTurn (offline) | Discounted Return: 0.00e+0 | 5 |
Showing 10 of 13 rows
