Quantum Hierarchical Reinforcement Learning via Variational Quantum Circuits

About

Reinforcement learning is one of the most challenging learning paradigms, and gains in efficacy and efficiency are extremely valuable. Hierarchical reinforcement learning is a variant that leverages temporal abstraction to structure decision-making. While parametrized quantum computations have shown success in non-hierarchical reinforcement learning, whether these advantages carry over to hierarchical decision-making remains a critical open question. In this work, we develop a hybrid hierarchical agent based on the option-critic architecture. This hybrid agent substitutes variational quantum circuits for classical components: feature extractors, option-value functions, termination functions, and intra-option policies. Evaluated on standard benchmarking environments, a hybrid agent utilizing a quantum feature extractor can outperform classical baselines while using up to 66% fewer trainable parameters. We also identify an architectural bottleneck: quantum option-value estimation severely degrades performance. Further ablation studies reveal how architectural choices of the quantum circuits affect performance. Our work establishes design principles for parameter-efficient hybrid hierarchical agents.
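To make the idea of a variational quantum circuit used as a feature extractor concrete, here is a minimal NumPy statevector sketch. The abstract does not specify the circuit, so everything here is an assumption chosen for illustration: the RY angle encoding, the trainable RY layers with a ring of CNOTs, the Pauli-Z readout, and the function names (`vqc_features` and its helpers). It is not the authors' architecture.

```python
import numpy as np

def ry(theta):
    """Single-qubit RY rotation matrix."""
    c, s = np.cos(theta / 2.0), np.sin(theta / 2.0)
    return np.array([[c, -s], [s, c]])

def apply_single_qubit(state, gate, qubit, n_qubits):
    """Apply a 2x2 gate to one qubit of an n-qubit statevector."""
    psi = state.reshape([2] * n_qubits)
    psi = np.tensordot(gate, psi, axes=([1], [qubit]))
    psi = np.moveaxis(psi, 0, qubit)  # put the acted-on axis back in place
    return psi.reshape(-1)

def apply_cnot(state, control, target, n_qubits):
    """Flip the target qubit on the subspace where the control qubit is 1."""
    psi = state.reshape([2] * n_qubits).copy()
    idx = [slice(None)] * n_qubits
    idx[control] = 1
    # Indexing the control axis with an integer drops it, shifting later axes.
    t_axis = target - 1 if target > control else target
    psi[tuple(idx)] = np.flip(psi[tuple(idx)], axis=t_axis).copy()
    return psi.reshape(-1)

def z_expectation(state, qubit, n_qubits):
    """Expectation value of Pauli-Z on one qubit."""
    probs = (np.abs(state) ** 2).reshape([2] * n_qubits)
    other_axes = tuple(a for a in range(n_qubits) if a != qubit)
    marginal = probs.sum(axis=other_axes)
    return marginal[0] - marginal[1]

def vqc_features(x, weights):
    """Hypothetical VQC feature extractor (assumed ansatz, not the paper's).

    x:       shape (n_qubits,), input angles, one per qubit
    weights: shape (n_layers, n_qubits), trainable rotation angles
    returns: shape (n_qubits,), Pauli-Z expectations in [-1, 1]
    """
    n_layers, n_qubits = weights.shape
    state = np.zeros(2 ** n_qubits)
    state[0] = 1.0  # start in |0...0>
    for q in range(n_qubits):  # angle encoding of the input
        state = apply_single_qubit(state, ry(x[q]), q, n_qubits)
    for layer in range(n_layers):
        for q in range(n_qubits):  # trainable rotations
            state = apply_single_qubit(state, ry(weights[layer, q]), q, n_qubits)
        if n_qubits > 1:
            for q in range(n_qubits):  # ring of CNOTs for entanglement
                state = apply_cnot(state, q, (q + 1) % n_qubits, n_qubits)
    return np.array([z_expectation(state, q, n_qubits) for q in range(n_qubits)])

# Example: 4 input features, 2 layers, so 2 * 4 = 8 trainable angles.
rng = np.random.default_rng(0)
features = vqc_features(rng.uniform(-1, 1, size=4), rng.uniform(-np.pi, np.pi, size=(2, 4)))
```

In this sketch the number of trainable parameters is `n_layers * n_qubits`. In a hybrid agent of the kind the abstract describes, the returned expectation values would be passed on to the remaining option-critic components.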

Yu-Ting Lee, Samuel Yen-Chi Chen, Fu-Chieh Chang • 2026

Related benchmarks

Task	Dataset	Result	Rank
Reinforcement Learning	Acrobot v1	Mean Return: -125.7	42

