Variational Quantum Rainbow Deep Q-Network for Optimizing Resource Allocation Problem
About
Resource allocation remains NP-hard due to combinatorial complexity. While deep reinforcement learning (DRL) methods, such as the Rainbow Deep Q-Network (DQN), improve scalability through prioritized replay and distributional heads, classical function approximators limit their representational power. We introduce Variational Quantum Rainbow DQN (VQR-DQN), which integrates ring-topology variational quantum circuits with Rainbow DQN to leverage quantum superposition and entanglement. We frame the human resource allocation problem (HRAP) as a Markov decision process (MDP) with combinatorial action spaces based on officer capabilities, event schedules, and transition times. On four HRAP benchmarks, VQR-DQN achieves 26.8% normalized makespan reduction versus random baselines and outperforms Double DQN and classical Rainbow DQN by 4.9-13.4%. These gains align with theoretical connections between circuit expressibility, entanglement, and policy quality, demonstrating the potential of quantum-enhanced DRL for large-scale resource allocation. Our implementation is available at: https://github.com/Analytics-Everywhere-Lab/qtrl/.
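To make "ring-topology variational quantum circuit" concrete, the following is a minimal sketch of one such circuit, simulated as a plain statevector in pure Python. It is not taken from the authors' repository. The gate choices (RY angle encoding, one trainable RY per qubit per layer, CNOTs from qubit i to qubit (i + 1) mod n, Pauli-Z readout) and all function names are assumptions made for illustration; the actual VQR-DQN layout may differ.

```python
import math

def apply_ry(state, theta, qubit, n_qubits):
    """Apply RY(theta) to one qubit of a real-valued statevector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    mask = 1 << (n_qubits - 1 - qubit)  # qubit 0 is the leftmost bit
    new = state[:]
    for i in range(len(state)):
        if (i & mask) == 0:
            a0, a1 = state[i], state[i | mask]
            new[i] = c * a0 - s * a1
            new[i | mask] = s * a0 + c * a1
    return new

def apply_cnot(state, control, target, n_qubits):
    """Flip the target bit of every basis state whose control bit is 1."""
    cmask = 1 << (n_qubits - 1 - control)
    tmask = 1 << (n_qubits - 1 - target)
    new = state[:]
    for i in range(len(state)):
        if i & cmask:
            new[i ^ tmask] = state[i]
    return new

def expect_z(state, qubit, n_qubits):
    """Expectation value of Pauli-Z on one qubit, in [-1, 1]."""
    mask = 1 << (n_qubits - 1 - qubit)
    return sum((-1.0 if i & mask else 1.0) * a * a for i, a in enumerate(state))

def ring_vqc(features, weights):
    """Hypothetical ring-entangled circuit (assumed layout, not the paper's).

    features: one encoding angle per qubit (at least 2 qubits).
    weights:  list of layers, each a list of one trainable angle per qubit.
    Returns one Pauli-Z expectation per qubit.
    """
    n = len(features)
    if n < 2:
        raise ValueError("a ring of CNOTs needs at least 2 qubits")
    state = [0.0] * (2 ** n)
    state[0] = 1.0  # start in |0...0>
    for q, x in enumerate(features):  # angle-encode the observation
        state = apply_ry(state, x, q, n)
    for layer in weights:
        for q, theta in enumerate(layer):  # trainable rotations
            state = apply_ry(state, theta, q, n)
        for q in range(n):  # ring entanglement: q -> (q + 1) mod n
            state = apply_cnot(state, q, (q + 1) % n, n)
    return [expect_z(state, q, n) for q in range(n)]

if __name__ == "__main__":
    # Three qubits, two layers of arbitrary illustrative angles.
    print(ring_vqc([0.3, 1.1, 2.0], [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]))
```

In a quantum-enhanced DQN of this kind, the per-qubit expectation values would typically be passed to a classical layer that produces Q-values or a return distribution; that part is omitted here because the abstract does not specify it.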
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Resource Allocation | HRAP 3O-2T-2E 1.0 (test) | Averaged Rewards-0.3823 | 4 |
| Resource Allocation | HRAP 4O-3T-2E 1.0 (test) | Averaged Rewards-0.2815 | 4 |
| Resource Allocation | HRAP 4O-3T-3E 1.0 (test) | Averaged Rewards-0.2872 | 4 |
| Resource Allocation | HRAP 5O-4T-4E 1.0 (test) | Averaged Rewards-0.2236 | 4 |