Prioritized Experience Replay
About
Experience replay lets online reinforcement learning agents remember and reuse experiences from the past. In prior work, experience transitions were uniformly sampled from a replay memory. However, this approach simply replays transitions at the same frequency that they were originally experienced, regardless of their significance. In this paper we develop a framework for prioritizing experience, so as to replay important transitions more frequently, and therefore learn more efficiently. We use prioritized experience replay in Deep Q-Networks (DQN), a reinforcement learning algorithm that achieved human-level performance across many Atari games. DQN with prioritized experience replay achieves a new state-of-the-art, outperforming DQN with uniform replay on 41 out of 49 games.
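The core idea, proportional prioritization, samples transition $i$ with probability $P(i) = p_i^\alpha / \sum_k p_k^\alpha$, where $p_i$ is the transition's priority (e.g. its last absolute TD error) and $\alpha$ controls how strongly prioritization departs from uniform sampling; importance-sampling weights correct the resulting bias. The sketch below illustrates this scheme with a plain list-based buffer (the paper uses a sum-tree for efficiency); class and parameter names are illustrative, not from the paper.

```python
import random

class PrioritizedReplayBuffer:
    """Minimal sketch of proportional prioritized replay (illustrative names)."""

    def __init__(self, capacity, alpha=0.6, beta=0.4, eps=1e-6):
        self.capacity = capacity
        self.alpha = alpha      # prioritization strength (0 = uniform replay)
        self.beta = beta        # importance-sampling correction strength
        self.eps = eps          # keeps every priority strictly positive
        self.data = []
        self.priorities = []
        self.pos = 0

    def add(self, transition):
        # New transitions get the current max priority so each is
        # replayed at least once before its TD error is known.
        p = max(self.priorities, default=1.0)
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(p)
        else:
            self.data[self.pos] = transition
            self.priorities[self.pos] = p
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # P(i) = p_i^alpha / sum_k p_k^alpha
        scaled = [p ** self.alpha for p in self.priorities]
        total = sum(scaled)
        probs = [s / total for s in scaled]
        idxs = random.choices(range(len(self.data)), weights=probs, k=batch_size)
        # Importance-sampling weights w_i = (N * P(i))^(-beta), normalized
        # by the max weight so updates are only ever scaled downward.
        n = len(self.data)
        weights = [(n * probs[i]) ** (-self.beta) for i in idxs]
        w_max = max(weights)
        weights = [w / w_max for w in weights]
        return idxs, [self.data[i] for i in idxs], weights

    def update_priorities(self, idxs, td_errors):
        # After a learning step, refresh priorities from the new TD errors.
        for i, err in zip(idxs, td_errors):
            self.priorities[i] = abs(err) + self.eps
```

A production version would replace the linear-time sampling with a sum-tree so that both sampling and priority updates are O(log N).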
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Reinforcement Learning | Atari 2600 MONTEZUMA'S REVENGE | Score | 13 | 45 |
| Atari Game Playing | Pitfall! | Score | -15 | 25 |
| Reinforcement Learning | Atari 2600 57 games | Median Human-Normalized Score | 140 | 20 |
| Visual Reinforcement Learning | CARLA (#GP scenario) | ER | 51 | 15 |
| Autonomous Driving | CARLA (#HW) | Error Rate | 159 | 15 |
| Reinforcement Learning | Atari 2600 57 games (test) | Median Human-Normalized Score | 124 | 15 |
| Atari Game Playing | Atari 2600 57 games, human-starts evaluation | Median Human-Normalized Score | 128 | 14 |
| Game Playing | Atari 2600 (Arcade Learning Environment) v1 (test) | Alien Score | 900.5 | 13 |
| Continual Reinforcement Learning | Meta-World MT50 v2 | AP | 68.7 | 11 |
| Online Continual Self-Supervised Learning | CIFAR-100 streaming online, 20 experiences | Final Accuracy | 48.5 | 9 |