| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Reinforcement Learning | Atari 2600 Games Breakdown | Avg Reward (baseline) 1,399,753 | 52 |
| Reinforcement Learning | Atari 2600 MONTEZUMA'S REVENGE | Score 18,003,200 | 45 |
| Reinforcement Learning | Atari 2600 Montezuma's Revenge ALE (test) | Score 14,973 | 24 |
| Reinforcement Learning | Atari 2600 57 games | Median Human-Normalized Score 223 | 20 |
| Reinforcement Learning | Atari 2600 Private Eye ALE (test) | Score 17,313 | 19 |
| Reinforcement Learning | Atari 2600 Qbert | Score 684,700 | 15 |
| Reinforcement Learning | Atari 2600 57 games (test) | Median Human-Normalized Score 1,006.4 | 15 |
| Reinforcement Learning | Atari 2600 Freeway ALE (test) | Score 34 | 14 |
| Atari game playing | Atari 2600 57 games human starts evaluation metric | Median Human-Normalized Score 162 | 14 |
| Reinforcement Learning | Atari 2600 Arcade Learning Environment (evaluation) | Montezuma's Revenge Score 3,459 | 11 |
| Reinforcement Learning | Atari 2600 GRAVITAR | GRAVITAR Score 3,351.4 | 10 |
| Reinforcement Learning | Atari 2600 (test) | Alien Score 4,704 | 10 |
| Reinforcement Learning | Atari 2600 (test) | Alien 6,875 | 10 |
| Reinforcement Learning | Atari 2600 | Alien Score 7,128 | 9 |
| Reinforcement Learning | Atari 2600 FREEWAY | Score 32.4 | 9 |
| Atari Game Playing | Atari 2600 ALE (test) | Freeway Score 34 | 8 |
| Reinforcement Learning | Atari 2600 Solaris | Average Score 2,279.4 | 8 |
| Reinforcement Learning | Atari 2600 | Asterix Score 242 | 7 |
| Reinforcement Learning | Atari 2600 25 games | Mean Human Normalized Score 518.2 | 7 |
| Reinforcement Learning | Atari 2600 55 games (test) | Mean Human-Normalized Score 1,426 | 7 |
| Reinforcement Learning | Atari 2600 SpaceInvaders | Best Stable Score 1,400 | 7 |
| Reinforcement Learning | Atari 2600 Seaquest | Average Score 4,770 | 7 |
| Atari Games Playing | Atari 2600 60 games (test) | Human Norm Mean (%) 563 | 6 |
| Reinforcement Learning | Atari 2600 TimePilot | Score 7,340 | 6 |
| Reinforcement Learning | Atari 2600 RND-dominating games (test) | PPO-normalized Mean Score 435.56 | 5 |