DisCoRL: Continual Reinforcement Learning via Policy Distillation

About

In multi-task reinforcement learning there are two main challenges: at training time, the ability to learn different policies with a single model; at test time, inferring which of those policies to apply without an external signal. In the case of continual reinforcement learning, a third challenge arises: learning tasks sequentially without forgetting the previous ones. In this paper, we tackle these challenges by proposing DisCoRL, an approach combining state representation learning and policy distillation. We experiment on a sequence of three simulated 2D navigation tasks with a three-wheel omni-directional robot. Moreover, we test the robustness of our approach by transferring the final policy to a real-life setting. The policy can solve all tasks and automatically infer which one to run.
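As a rough illustration of the policy distillation idea mentioned above, the sketch below computes a distillation loss as the mean KL divergence between a teacher policy's action distribution and a student policy's action distribution over a batch of states. This is a common textbook formulation and an assumption here; the abstract does not state the exact loss used by DisCoRL. The function names (`softmax`, `kl_divergence`, `distillation_loss`) and the `temperature` parameter are hypothetical and chosen for illustration only.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw action scores into a probability distribution."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def distillation_loss(teacher_logits_batch, student_logits_batch, temperature=1.0):
    """Mean KL divergence between teacher and student action distributions.

    Assumption: each batch entry holds the action logits that the teacher
    (resp. student) produces for the same state.
    """
    losses = []
    for t_logits, s_logits in zip(teacher_logits_batch, student_logits_batch):
        p_teacher = softmax(t_logits, temperature)
        p_student = softmax(s_logits, temperature)
        losses.append(kl_divergence(p_teacher, p_student))
    return sum(losses) / len(losses)


# Hypothetical logits for two states and three discrete actions.
teacher = [[2.0, 0.5, -1.0], [0.1, 0.2, 3.0]]
student = [[1.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
loss = distillation_loss(teacher, student)
```

In a continual setting, one would presumably minimise such a loss on data collected from each task's teacher policy so that a single student reproduces all of them; how DisCoRL actually gathers and mixes that data is not specified on this page.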

René Traoré, Hugo Caselles-Dupré, Timothée Lesort, Te Sun, Guanghang Cai, Natalia Díaz-Rodríguez, David Filliat • 2019

Related benchmarks

Task                              Dataset            Result            Rank
Continual Learning                CW10 (sequence)    Performance: 84   27
Continual Reinforcement Learning  CW20 sequence      Performance: 83   11
