BYOL-Explore: Exploration by Bootstrapped Prediction
About
We present BYOL-Explore, a conceptually simple yet general approach for curiosity-driven exploration in visually complex environments. BYOL-Explore learns a world representation, the world dynamics, and an exploration policy all together by optimizing a single prediction loss in the latent space with no additional auxiliary objective. We show that BYOL-Explore is effective in DM-HARD-8, a challenging partially-observable continuous-action hard-exploration benchmark with visually rich 3-D environments. On this benchmark, we solve the majority of the tasks purely through augmenting the extrinsic reward with BYOL-Explore's intrinsic reward, whereas prior work could only get off the ground with human demonstrations. As further evidence of the generality of BYOL-Explore, we show that it achieves superhuman performance on the ten hardest exploration games in Atari while having a much simpler design than other competitive agents.
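To make the idea of a latent-space prediction loss that doubles as an intrinsic reward more concrete, here is a minimal sketch in plain Python. It only illustrates the arithmetic and is not the authors' implementation: the function names, the unit-normalisation of the latents, and the `scale` mixing weight are assumptions chosen for illustration, and the sketch omits all learning (encoder, dynamics model, and policy updates).

```python
import math


def l2_normalize(vec, eps=1e-8):
    """Scale a vector to (approximately) unit length."""
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / (norm + eps) for v in vec]


def latent_prediction_error(predicted, target):
    """Squared L2 distance between unit-normalised latent vectors.

    Assumption: the same quantity serves both as the prediction loss
    and as the intrinsic (curiosity) reward for the transition.
    """
    p = l2_normalize(predicted)
    t = l2_normalize(target)
    return sum((a - b) ** 2 for a, b in zip(p, t))


def augmented_reward(extrinsic, intrinsic, scale=0.1):
    """Add the scaled intrinsic reward to the extrinsic reward.

    `scale` is a hypothetical mixing weight, not a value from the paper.
    """
    return extrinsic + scale * intrinsic


# A well-predicted transition: prediction and target point the same way.
familiar = latent_prediction_error([1.0, 0.0], [2.0, 0.0])

# A poorly predicted transition: prediction and target are orthogonal.
novel = latent_prediction_error([1.0, 0.0], [0.0, 3.0])

total = augmented_reward(extrinsic=1.0, intrinsic=novel, scale=0.1)
```

In this toy setup the well-predicted transition yields an intrinsic reward near zero, while the poorly predicted one yields a larger reward, so a policy trained on `total` would be nudged toward transitions the model cannot yet predict.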
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Reinforcement Learning | Atari 2600 MONTEZUMA'S REVENGE | Score | 5.15e+3 | 45 |
| Reinforcement Learning | Atari 2600 Qbert | Score | 2.00e+5 | 15 |
| Reinforcement Learning | Atari 10 hard-exploration games (train) | Alien Score | 1.25e+5 | 10 |
| Reinforcement Learning | Atari 2600 GRAVITAR | GRAVITAR Score | 796 | 10 |
| Reinforcement Learning | Atari 2600 FREEWAY | Score | 12.94 | 9 |
| Baseball | DM-HARD-8 | Max Agent Score | 9.94 | 5 |
| Navigate Cubes | DM-HARD-8 | Max Agent Score | 10 | 5 |
| Reinforcement Learning | Atari 10-hardest exploration games (test) | Mean CHNS | 100 | 5 |
| Reinforcement Learning | Atari 10 hard-exploration games | Alien Score | 1.25e+5 | 5 |
| Throw Across | DM-HARD-8 | Max Agent Score | 8.46 | 5 |