
Distributed Prioritized Experience Replay

About

We propose a distributed architecture for deep reinforcement learning at scale that enables agents to learn effectively from orders of magnitude more data than previously possible. The algorithm decouples acting from learning: the actors interact with their own instances of the environment by selecting actions according to a shared neural network, and accumulate the resulting experience in a shared experience replay memory; the learner replays samples of experience and updates the neural network. The architecture relies on prioritized experience replay to focus only on the most significant data generated by the actors. Our architecture substantially improves the state of the art on the Arcade Learning Environment, achieving better final performance in a fraction of the wall-clock training time.
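The prioritized replay memory at the core of this design can be sketched as follows. This is a minimal single-process illustration, not the paper's distributed implementation: in Ape-X the actors and learner run as separate processes around a shared replay service, and the class name, hyperparameter values (`alpha`, `beta`), and the small epsilon added to priorities are assumptions for illustration, following the general proportional-prioritization scheme.

```python
import random

class PrioritizedReplay:
    """Sketch of a proportional prioritized replay buffer.

    Transitions are sampled with probability proportional to
    priority^alpha, and importance-sampling weights correct the
    resulting bias. Hyperparameters here are illustrative defaults,
    not values taken from the paper.
    """

    def __init__(self, capacity, alpha=0.6):
        self.capacity = capacity
        self.alpha = alpha
        self.buffer = []       # stored transitions (ring buffer)
        self.priorities = []   # one priority per stored transition
        self.pos = 0           # next slot to overwrite when full

    def add(self, transition, td_error):
        # Actors compute an initial priority from the TD error of the
        # transition they just generated, then push it to the buffer.
        p = (abs(td_error) + 1e-6) ** self.alpha
        if len(self.buffer) < self.capacity:
            self.buffer.append(transition)
            self.priorities.append(p)
        else:
            self.buffer[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size, beta=0.4):
        # The learner samples transitions in proportion to priority.
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idxs = random.choices(range(len(self.buffer)),
                              weights=probs, k=batch_size)
        # Importance-sampling weights, normalized by the max so the
        # largest weight is 1.
        n = len(self.buffer)
        weights = [(n * probs[i]) ** (-beta) for i in idxs]
        w_max = max(weights)
        weights = [w / w_max for w in weights]
        return idxs, [self.buffer[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, priorities of the sampled transitions
        # are refreshed with the learner's new TD errors.
        for i, e in zip(idxs, td_errors):
            self.priorities[i] = (abs(e) + 1e-6) ** self.alpha
```

A production version would replace the linear `sample` with a sum-tree so sampling and priority updates are O(log n); the list-based version above only shows the sampling distribution and weight computation.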

Dan Horgan, John Quan, David Budden, Gabriel Barth-Maron, Matteo Hessel, Hado van Hasselt, David Silver · 2018

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Reinforcement Learning | Atari 2600 Montezuma's Revenge | Score: 2.50e+3 | 45 |
| Atari Game Playing | Pitfall! | Score: -1 | 25 |
| Reinforcement Learning | Atari 2600 Atlantis | Score: 8.32e+5 | 21 |
| Reinforcement Learning | Atari 2600 57 games (test) | - | 15 |
| Reinforcement Learning | Atari large data setting | Median Human-Normalized Score: 434.1 | 3 |
