
Diversity-based Trajectory and Goal Selection with Hindsight Experience Replay

About

Hindsight experience replay (HER) is a goal relabelling technique typically used with off-policy deep reinforcement learning algorithms to solve goal-oriented tasks; it is well suited to robotic manipulation tasks that deliver only sparse rewards. In HER, both trajectories and transitions are sampled uniformly for training. However, not all of the agent's experiences contribute equally to training, and so naive uniform sampling may lead to inefficient learning. In this paper, we propose diversity-based trajectory and goal selection with HER (DTGSH). Firstly, trajectories are sampled according to the diversity of the goal states as modelled by determinantal point processes (DPPs). Secondly, transitions with diverse goal states are selected from the trajectories by using k-DPPs. We evaluate DTGSH on five challenging robotic manipulation tasks in simulated robot environments, where we show that our method can learn more quickly and reach higher performance than other state-of-the-art approaches on all tasks.
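The two selection steps described above can be illustrated with a small sketch. It is not the authors' implementation: the RBF kernel, the `length_scale` parameter, and all function names are assumptions made for illustration, and the transition step uses a greedy determinant-maximising heuristic in place of exact k-DPP sampling. The idea it shows is that the determinant of a similarity matrix over goal states is near zero when the goals are nearly identical and grows as they spread out, so it can serve as a diversity score.

```python
import math
import random


def rbf_kernel(points, length_scale=1.0):
    """Similarity matrix with entries exp(-||p_i - p_j||^2 / (2 * length_scale^2)).

    The RBF kernel is an assumption; the source does not name a kernel.
    """
    n = len(points)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(points[i], points[j]))
                      / (2.0 * length_scale ** 2))
             for j in range(n)] for i in range(n)]


def determinant(matrix):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in matrix]
    n = len(a)
    det = 1.0
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        if abs(a[pivot][col]) < 1e-12:
            return 0.0  # numerically singular: the points are (almost) duplicates
        if pivot != col:
            a[col], a[pivot] = a[pivot], a[col]
            det = -det
        det *= a[col][col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
    return det


def diversity(goals, length_scale=1.0):
    """DPP-style diversity score: determinant of the kernel matrix of the goals."""
    return determinant(rbf_kernel(goals, length_scale))


def sample_trajectory(trajectories, rng=random):
    """Return a trajectory index with probability proportional to its diversity."""
    weights = [diversity(goals) for goals in trajectories]
    if sum(weights) <= 0.0:
        return rng.randrange(len(trajectories))  # fall back to uniform sampling
    return rng.choices(range(len(trajectories)), weights=weights, k=1)[0]


def greedy_diverse_subset(goals, k, length_scale=1.0):
    """Greedily pick k indices that maximise the kernel sub-determinant.

    This is a deterministic approximation, not exact k-DPP sampling.
    """
    kernel = rbf_kernel(goals, length_scale)
    chosen = []
    for _ in range(min(k, len(goals))):
        best, best_det = None, -1.0
        for i in range(len(goals)):
            if i in chosen:
                continue
            idx = chosen + [i]
            sub = [[kernel[r][c] for c in idx] for r in idx]
            d = determinant(sub)
            if d > best_det:
                best, best_det = i, d
        chosen.append(best)
    return chosen
```

In this sketch each trajectory is simply a list of goal-state vectors; a trajectory whose goals never change scores zero and is effectively never sampled, while the greedy step skips near-duplicate goals in favour of distant ones.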

Tianhong Dai, Hengyan Liu, Kai Arulkumaran, Guangyu Ren, Anil Anthony Bharath • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Robotic Pushing | FetchPush v1 | Success Rate | 100 | 10 |
| Robotic Hand Reaching | HandReach v0 | Success Rate | 62 | 10 |
| Robotic Block Manipulation | HandManipulateBlockFull v0 | Success Rate | 7 | 10 |
| Robotic Egg Manipulation | HandManipulateEggFull v0 | Success Rate | 29 | 10 |
| Robotic Pen Rotation | HandManipulatePenRotate v0 | Success Rate | 25 | 10 |
| Robotic Pick-and-Place | FetchPickAndPlace v1 | Success Rate | 93 | 10 |
| Robotic Manipulation | FetchPush v1 | Time-to-Threshold (Epochs) | 14 | 5 |
| Robotic Manipulation | HandReach v0 | Cumulative Regret | 60.5 | 5 |
| Robotic Manipulation | HandManipulatePenRotate v0 | Time to Threshold (Epochs) | 22 | 5 |
| Robotic Manipulation | FetchPickAndPlace v1 | Time to Threshold (Epochs) | 40 | 5 |
Showing 10 of 12 rows
