
Beyond Not-Forgetting: Continual Learning with Backward Knowledge Transfer

About

By learning a sequence of tasks continually, an agent in continual learning (CL) can improve the learning performance of both a new task and "old" tasks by leveraging forward knowledge transfer and backward knowledge transfer, respectively. However, most existing CL methods focus on addressing catastrophic forgetting in neural networks by minimizing the modification of the learnt model for old tasks. This inevitably limits backward knowledge transfer from the new task to the old tasks, because judicious model updates could improve the learning performance of the old tasks as well. To tackle this problem, we first theoretically analyze the conditions under which updating the learnt model of old tasks could be beneficial for CL and lead to backward knowledge transfer, based on gradient projection onto the input subspaces of old tasks. Building on this analysis, we develop a ContinUal learning method with Backward knowlEdge tRansfer (CUBER) for a fixed-capacity neural network without data replay. In particular, CUBER first characterizes task correlation to identify positively correlated old tasks in a layer-wise manner, and then selectively modifies the learnt model of the old tasks when learning the new task. Experimental studies show that CUBER achieves, for the first time without data replay, positive backward knowledge transfer on several existing CL benchmarks, where related baselines still suffer from catastrophic forgetting (negative backward knowledge transfer). This superior backward knowledge transfer also translates into higher accuracy.
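The gradient-projection idea the analysis builds on can be illustrated with a short sketch. This is a generic illustration of splitting a new task's gradient against the subspace spanned by old tasks' inputs (as in gradient-projection CL methods), not the authors' released implementation; the function names and the energy threshold are assumptions for the example.

```python
import numpy as np

def input_subspace_basis(old_inputs, energy=0.95):
    # SVD of the old-task input matrix (features x samples); keep enough
    # left singular vectors to capture `energy` fraction of the spectrum.
    U, S, _ = np.linalg.svd(old_inputs, full_matrices=False)
    cum = np.cumsum(S ** 2) / np.sum(S ** 2)
    k = int(np.searchsorted(cum, energy)) + 1
    return U[:, :k]  # orthonormal basis M of the old-task input subspace

def split_gradient(grad, M):
    # Decompose the new-task gradient into a component inside the
    # old-task input subspace and an orthogonal residual.
    g_in = M @ (M.T @ grad)   # lies in span(M): interacts with old-task inputs
    g_orth = grad - g_in      # orthogonal to span(M)
    return g_in, g_orth
```

Forgetting-avoidance methods typically update only with the orthogonal residual, which (for a linear layer) leaves old-task outputs unchanged; the paper's analysis instead asks when applying part of the in-subspace component for positively correlated old tasks can yield positive backward transfer.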

Sen Lin, Li Yang, Deliang Fan, Junshan Zhang • 2022

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Continual Learning | CIFAR-100 (10-split) | ACC | 75.54 | 42 |
| Continual Image Classification | MiniImageNet Split | Accuracy | 64.25 | 29 |
| Continual Learning | OL-CIFAR100 (Tasks 0-6) | Accuracy (%) | 75.01 | 23 |
| Continual Learning | MNIST permuted | AT | 97.25 | 19 |
| Continual Image Classification | CIFAR100 Split | Accuracy | 75.3 | 17 |
| Continual Learning | 5-dataset | Accuracy | 93.48 | 16 |
| Lifelong Learning | Split miniImageNet (test) | Accuracy | 62.67 | 15 |
| Lifelong Learning | 5-dataset (test) | Accuracy | 93.48 | 15 |
| Continual Image Classification | 5-Datasets | Accuracy (%) | 92.77 | 12 |
| Continual Learning | Permuted-MNIST (P-MNIST) (test) | Accuracy | 97.25 | 11 |
