Parametrized Deep Q-Networks Learning: Reinforcement Learning with Discrete-Continuous Hybrid Action Space
About
Most existing deep reinforcement learning (DRL) frameworks consider either a discrete action space or a continuous action space, but not both. Motivated by applications in computer games, we consider the scenario with a discrete-continuous hybrid action space. To handle such a space, previous works either approximate it by discretization or relax it into a continuous set. In this paper, we propose a parametrized deep Q-network (P-DQN) framework for the hybrid action space without approximation or relaxation. Our algorithm combines the spirit of DQN (for discrete action spaces) and DDPG (for continuous action spaces) by seamlessly integrating them. Empirical results on a simulation example, on scoring a goal in simulated RoboCup soccer, and on the solo mode of the game King of Glory (KOG) validate the efficiency and effectiveness of our method.
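The core idea of action selection in this hybrid setting can be sketched as follows: a DDPG-style deterministic network proposes one continuous parameter vector per discrete action, and a DQN-style network scores each (discrete action, parameters) pair; the agent then acts greedily over the discrete choices. The snippet below is a minimal NumPy sketch of that selection step only (no training), with toy dimensions and random linear "networks" standing in for the learned ones; all names and sizes here are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy problem sizes (assumptions for illustration only).
STATE_DIM, N_DISCRETE, PARAM_DIM = 4, 3, 2

# Stand-ins for the two learned networks:
# - an actor mapping the state to one continuous parameter vector per
#   discrete action (DDPG-style),
# - a Q-network scoring each (discrete action, parameters) pair (DQN-style).
W_actor = rng.normal(size=(N_DISCRETE, PARAM_DIM, STATE_DIM))
W_q = rng.normal(size=(N_DISCRETE, STATE_DIM + PARAM_DIM))

def select_action(state):
    # 1) For every discrete action k, compute its continuous parameters x_k.
    params = np.tanh(W_actor @ state)              # shape (N_DISCRETE, PARAM_DIM)
    # 2) Score each pair (k, x_k) with the Q-network on [state; x_k].
    q_inputs = np.concatenate([np.tile(state, (N_DISCRETE, 1)), params], axis=1)
    q_values = np.einsum('kd,kd->k', W_q, q_inputs)  # one scalar per discrete action
    # 3) Greedy choice over discrete actions; keep that action's parameters.
    k = int(np.argmax(q_values))
    return k, params[k]

k, x_k = select_action(rng.normal(size=STATE_DIM))
```

In the full algorithm the Q-network is trained with a DQN-style temporal-difference loss, while the actor is updated to maximize the Q-values of the parameters it proposes, so the two are coupled rather than trained independently.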
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Half Field Offense | Half Field Offense (HFO) (evaluation) | P(Goal) | 0.989 | 7 |
| Air-to-Air Combat | Air-to-Air Combat Easy | Success Rate | 81.2 | 7 |
| Air-to-Air Combat | Air-to-Air Combat Medium | Success Rate | 74.2 | 7 |
| Air-to-Air Combat | Air-to-Air Combat Hard | Success Rate | 57.2 | 7 |
| Maze Navigation | Maze Navigation Easy | Success Rate | 85.8 | 7 |
| Maze Navigation | Maze Navigation Medium | Success Rate | 68.8 | 7 |
| Maze Navigation | Maze Navigation Hard | Success Rate | 38.4 | 7 |
| Platform Control | Platform (Evaluation) | Return | 96.4 | 5 |
| Robot Soccer | Robot Soccer Goal (Evaluation) | P(Goal) | 70.1 | 5 |