FDQN: A Flexible Deep Q-Network Framework for Game Automation
About
In reinforcement learning, it is often difficult to automate high-dimensional, rapid decision-making in dynamic environments, especially in domains such as web-based games that require real-time online interaction and adaptive strategies. This work proposes the Flexible Deep Q-Network (FDQN) framework, which addresses this challenge with a self-adaptive approach: it processes high-dimensional sensory data in real time using a CNN and dynamically adapts the model architecture to the varying action spaces of different gaming environments. The framework is reported to outperform previous baseline models on several Atari games and the Chrome Dino game. Using an epsilon-greedy policy, it balances exploration and exploitation for improved performance. It is designed with a modular structure so that it can be adapted to other HTML-based games without touching the core of the framework. The work demonstrates that the FDQN framework can solve a well-defined task under laboratory conditions, discusses potential applications to more challenging real-world cases, and is intended to serve as a starting point for further exploration of automated gameplay and beyond.
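The epsilon-greedy policy mentioned in the abstract can be illustrated with a minimal sketch. This is not the FDQN implementation: the function names `epsilon_greedy` and `decayed_epsilon`, the linear decay schedule, and all default values are assumptions chosen for illustration. The action count is read from the length of the Q-value vector, so the same selection code works for environments with different action spaces.

```python
import random
from typing import Sequence


def epsilon_greedy(q_values: Sequence[float], epsilon: float, rng: random.Random) -> int:
    """Return an action index given per-action Q-value estimates.

    With probability `epsilon` a uniformly random action is explored;
    otherwise the action with the highest Q-value is exploited.
    The number of actions is inferred from len(q_values).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda a: q_values[a])  # exploit


def decayed_epsilon(step: int, start: float = 1.0, end: float = 0.05,
                    decay_steps: int = 10_000) -> float:
    """Linearly anneal epsilon from `start` to `end` (hypothetical schedule)."""
    frac = min(step / decay_steps, 1.0)
    return start + frac * (end - start)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical Q-values for a three-action game (e.g. jump, duck, do nothing).
    q = [0.1, 0.9, 0.3]
    for step in (0, 5_000, 10_000):
        eps = decayed_epsilon(step)
        print(step, round(eps, 3), epsilon_greedy(q, eps, rng))
```

Early in training epsilon is close to 1, so most actions are random; as it decays, the agent relies increasingly on its learned Q-values.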
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Reinforcement Learning | MountainCar | Avg Episode Reward | -107 | 14 |
| Reinforcement Learning | Atari 2600 Frostbite ALE (test) | Avg Reward | 1.02e+3 | 13 |
| Reinforcement Learning | Atari Breakout | Mean Return | 297 | 11 |
| Reinforcement Learning | Atari Space Invaders | Mean Episode Return | 795 | 11 |
| Reinforcement Learning | cartpole | Average Reward | 198 | 9 |
| Reinforcement Learning | Atari 2600 assault ALE (test) | Final Score | 1.48e+3 | 4 |
| Reinforcement Learning | Chrome Dino | Average Reward | 728 | 4 |
| Reinforcement Learning | Pong | Average Reward | 18 | 4 |
| Reinforcement Learning | Pacman | Average Reward | 6.15e+3 | 4 |
| Reinforcement Learning | Qbert | Average Reward | 5.18e+3 | 4 |