Variational Quantum Rainbow Deep Q-Network for Optimizing Resource Allocation Problem

About

Resource allocation remains NP-hard due to combinatorial complexity. While deep reinforcement learning (DRL) methods, such as the Rainbow Deep Q-Network (DQN), improve scalability through prioritized replay and distributional heads, classical function approximators limit their representational power. We introduce Variational Quantum Rainbow DQN (VQR-DQN), which integrates ring-topology variational quantum circuits with Rainbow DQN to leverage quantum superposition and entanglement. We frame the human resource allocation problem (HRAP) as a Markov decision process (MDP) with combinatorial action spaces based on officer capabilities, event schedules, and transition times. On four HRAP benchmarks, VQR-DQN achieves 26.8% normalized makespan reduction versus random baselines and outperforms Double DQN and classical Rainbow DQN by 4.9-13.4%. These gains align with theoretical connections between circuit expressibility, entanglement, and policy quality, demonstrating the potential of quantum-enhanced DRL for large-scale resource allocation. Our implementation is available at: https://github.com/Analytics-Everywhere-Lab/qtrl/.
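As a rough illustration of what a ring-topology variational quantum circuit computes, the sketch below simulates one on a small real-valued statevector using only the Python standard library. It is not the authors' implementation: the angle encoding with RY gates, the RY-only trainable layers, the CNOT ring ordering, the Pauli-Z readout, and all function names (`apply_ry`, `apply_cnot`, `ring_circuit`, `expect_z`) are assumptions chosen for the example.

```python
import math

def apply_ry(state, qubit, theta):
    """Apply RY(theta) = [[c, -s], [s, c]] to one qubit of a real statevector."""
    c, s = math.cos(theta / 2), math.sin(theta / 2)
    new = state[:]
    bit = 1 << qubit
    for i in range(len(state)):
        if not i & bit:          # i has the qubit in |0>, j is its |1> partner
            j = i | bit
            a0, a1 = state[i], state[j]
            new[i] = c * a0 - s * a1
            new[j] = s * a0 + c * a1
    return new

def apply_cnot(state, control, target):
    """Flip the target qubit on basis states where the control qubit is 1."""
    new = state[:]
    cb, tb = 1 << control, 1 << target
    for i in range(len(state)):
        if i & cb and not i & tb:
            j = i | tb
            new[i], new[j] = state[j], state[i]
    return new

def ring_circuit(features, weights):
    """Angle-encode features, then apply layers of RY rotations followed by
    a closed ring of CNOTs (qubit q controls qubit (q + 1) mod n).
    Assumes at least two qubits and len(layer) == len(features)."""
    n = len(features)
    state = [0.0] * (2 ** n)
    state[0] = 1.0               # start in |0...0>
    for q, x in enumerate(features):
        state = apply_ry(state, q, x)
    for layer in weights:
        for q, theta in enumerate(layer):
            state = apply_ry(state, q, theta)
        for q in range(n):
            state = apply_cnot(state, q, (q + 1) % n)
    return state

def expect_z(state, qubit):
    """Pauli-Z expectation value of one qubit, a number in [-1, 1]."""
    bit = 1 << qubit
    return sum((-1.0 if i & bit else 1.0) * a * a for i, a in enumerate(state))

# Hypothetical usage: 3 qubits, 2 trainable layers, one readout per qubit.
features = [0.3, 1.1, -0.7]
weights = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]
final_state = ring_circuit(features, weights)
readouts = [expect_z(final_state, q) for q in range(len(features))]
```

In a quantum-enhanced DQN, readouts of this kind would typically be passed through a classical layer to produce action values or distribution parameters; how VQR-DQN does this is not specified in the abstract above.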

Truong Thanh Hung Nguyen, Truong Thinh Nguyen, Hung Cao • 2025

Related benchmarks

Task	Dataset	Result
Resource Allocation	HRAP 3O-2T-2E 1.0 (test)	Averaged Rewards -0.3823	4
Resource Allocation	HRAP 4O-3T-2E 1.0 (test)	Averaged Rewards -0.2815	4
Resource Allocation	HRAP 4O-3T-3E 1.0 (test)	Averaged Rewards -0.2872	4
Resource Allocation	HRAP 5O-4T-4E 1.0 (test)	Averaged Rewards -0.2236	4
