Preserving Linear Separability in Continual Learning by Backward Feature Projection

About

Catastrophic forgetting has been a major challenge in continual learning, where the model needs to learn new tasks with limited or no access to data from previously seen tasks. To tackle this challenge, methods based on knowledge distillation in feature space have been proposed and shown to reduce forgetting. However, most feature distillation methods directly constrain the new features to match the old ones, overlooking the need for plasticity. To achieve a better stability-plasticity trade-off, we propose Backward Feature Projection (BFP), a method for continual learning that allows the new features to change up to a learnable linear transformation of the old features. BFP preserves the linear separability of the old classes while allowing the emergence of new feature directions to accommodate new classes. BFP can be integrated with existing experience replay methods and boost performance by a significant margin. We also demonstrate that BFP helps learn a better representation space, in which linear separability is well preserved during continual learning and linear probing achieves high classification accuracy. The code can be found at https://github.com/rvl-lab-utoronto/BFP
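The contrast between direct feature matching and matching "up to a learnable linear transformation" can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the linked repository for that). It rests on several assumptions: the function names are hypothetical, the features are synthetic, the projection maps new features back onto old ones (a direction inferred from the method's name), and the matrix `A` is fitted in closed form by least squares rather than learned by gradient descent alongside the network.

```python
import numpy as np


def feature_distillation_loss(z_new, z_old):
    """Direct feature matching: mean squared distance between new and old features."""
    return float(np.mean(np.sum((z_new - z_old) ** 2, axis=1)))


def projected_distillation_loss(z_new, z_old):
    """Matching up to a linear map: fit A so that z_new @ A approximates z_old.

    Assumption: A is solved by least squares here purely for illustration;
    in a training loop it would be a learnable parameter.
    """
    A, *_ = np.linalg.lstsq(z_new, z_old, rcond=None)
    loss = float(np.mean(np.sum((z_new @ A - z_old) ** 2, axis=1)))
    return loss, A


rng = np.random.default_rng(0)
z_old = rng.normal(size=(200, 8))   # synthetic "old" features: 200 samples, 8 dims
M = rng.normal(size=(8, 8))         # arbitrary linear change of the feature space
z_new = z_old @ M                   # "new" features differ from old ones by a linear map

direct = feature_distillation_loss(z_new, z_old)
projected, A = projected_distillation_loss(z_new, z_old)

print(f"direct matching loss:    {direct:.4f}")
print(f"projected matching loss: {projected:.2e}")
```

In this toy setting the new features are an exact linear transformation of the old ones, so any linear classifier on the old features has an equivalent one on the new features. Direct matching still penalizes the change heavily, while the projected loss is close to zero, which is the extra plasticity the abstract describes.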

Qiao Gu, Dongsub Shim, Florian Shkurti • 2023

Related benchmarks

Task	Dataset	Metric	Result
Continual Learning	CIFAR100 (test)	Mean Accuracy	57.39	69
Continual Learning	CIFAR-10 (test)	Final Average Accuracy (FAA)	73.51	31
Continual Learning	CIFAR-100 Split 10 sequential tasks (test)	Final Forgetting (FF)	19.76	24
Continual Learning	TinyImageNet Split 10 sequential tasks (test)	Final Forgetting	27.19	24
Continual Learning	CIFAR-10 Split 5 sequential tasks (test)	Final Forgetting (FF)	16.81	24
Continual Learning	CIFAR-10	ECE	9.4	15
Continual Learning	CIFAR-100	ECE	9.28	15
Continual Learning	Tiny-ImageNet	ECE	8.25	14
