Don't Forget the Critic: Value-Based Data Rehearsal for Multi-Cyclic Continual Reinforcement Learning

About

Data rehearsal has emerged as a leading approach for mitigating catastrophic forgetting in Continual Reinforcement Learning (CRL). However, existing work remains confined to policy gradient frameworks, regularizing only actors due to the performance degradation incurred by critic regularization. This actor-centric approach overlooks the potential of data rehearsal for value function approximation. Moreover, existing evaluations in CRL rarely consider multi-cyclic environments where task sequences repeat, a critical real-world scenario that exacerbates both forgetting and loss of plasticity. We investigate data rehearsal for Deep Q-Networks using Q-value regularization in multi-cyclic settings and propose Qreg+NWLU, which introduces two simple modifications: (1) continuous data rehearsal that dynamically collects and updates stored Q-values throughout training, and (2) "No-Wait" regularization that is applied immediately rather than only after the first task. Together, these modifications yield improvements in learning efficiency, forgetting mitigation, and knowledge transfer over Qreg and conventional CRL methods within value function approximation settings.
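To make the two modifications concrete, here is a minimal sketch of Q-value rehearsal, not the authors' implementation. The names `QRehearsalBuffer`, `q_regularized_loss`, and `reg_weight` are hypothetical. Reservoir sampling and a mean-squared penalty are assumptions chosen for illustration, since the abstract does not specify either. The buffer is filled throughout training (continuous rehearsal), and the penalty is active as soon as the buffer holds any entry, including during the first task ("No-Wait").

```python
import numpy as np


class QRehearsalBuffer:
    """Hypothetical store of states paired with the Q-values recorded for them."""

    def __init__(self, capacity, seed=0):
        self.capacity = capacity
        self.states = []
        self.q_values = []
        self.seen = 0
        self.rng = np.random.default_rng(seed)

    def add(self, state, q):
        # Assumption: reservoir sampling, so the buffer stays a uniform
        # sample of everything seen so far, across all tasks and cycles.
        self.seen += 1
        state = np.asarray(state, dtype=float)
        q = np.asarray(q, dtype=float)
        if len(self.states) < self.capacity:
            self.states.append(state)
            self.q_values.append(q)
        else:
            j = self.rng.integers(0, self.seen)
            if j < self.capacity:
                self.states[j] = state
                self.q_values[j] = q

    def sample(self, batch_size):
        k = min(batch_size, len(self.states))
        idx = self.rng.choice(len(self.states), size=k, replace=False)
        states = np.stack([self.states[i] for i in idx])
        stored_q = np.stack([self.q_values[i] for i in idx])
        return states, stored_q


def q_regularized_loss(q_net, td_loss, buffer, batch_size, reg_weight):
    """TD loss plus a penalty for drifting away from stored Q-values."""
    # "No-Wait": regularize whenever the buffer is non-empty, rather than
    # switching the penalty on only after the first task has finished.
    if not buffer.states:
        return td_loss
    states, stored_q = buffer.sample(batch_size)
    # Assumption: mean-squared error between current and stored Q-values.
    penalty = np.mean((q_net(states) - stored_q) ** 2)
    return td_loss + reg_weight * penalty
```

In a training loop, `buffer.add(state, q_net(state))` would be called at every step and `q_regularized_loss` used in place of the plain TD loss. The coefficient `reg_weight` controls the trade-off between retaining old Q-values and adapting to the current task.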

Benjamin Poole, Andrew Quinn, Li Yang, Minwoo Lee • 2026

Related benchmarks

Task	Dataset	Metric	Result
Catcher	Catcher Task 1	Worst Transfer Grand Average (W_bar)	0.06	9
Catching	Catcher Task 1 G.1.2	Return grand average G¯	500.5	9
Catching	Catcher Task 2 G.1.2	Return G¯ (Avg)	512.3	9
Continual Reinforcement Learning	Room Task 2	Worst Transfer Average (W-bar)	0.08	9
Continual Reinforcement Learning	Room Task 3	Worst Transfer Average (W-bar)	0.2	9
Continual Reinforcement Learning	Flappy Task 2	Worst Transfer Grand Average (W_bar)	-0.12	9
Reinforcement Learning	Flappy Task 1	Grand Average Return (G)	94.53	9
Reinforcement Learning	Flappy Task 2	Grand Average Return (G)	54.31	9
Reinforcement Learning	Flappy Task 3	Grand Average Return	32.29	9
Reinforcement Learning	Flappy Task 4	Grand Average Return	18.01	9

Showing 10 of 34 rows
