
Preserving Linear Separability in Continual Learning by Backward Feature Projection

About

Catastrophic forgetting has been a major challenge in continual learning, where the model needs to learn new tasks with limited or no access to data from previously seen tasks. To tackle this challenge, methods based on knowledge distillation in feature space have been proposed and shown to reduce forgetting. However, most feature distillation methods directly constrain the new features to match the old ones, overlooking the need for plasticity. To achieve a better stability-plasticity trade-off, we propose Backward Feature Projection (BFP), a method for continual learning that allows the new features to change up to a learnable linear transformation of the old features. BFP preserves the linear separability of the old classes while allowing the emergence of new feature directions to accommodate new classes. BFP can be integrated with existing experience replay methods and boost performance by a significant margin. We also demonstrate that BFP helps learn a better representation space, in which linear separability is well preserved during continual learning and linear probing achieves high classification accuracy. The code can be found at https://github.com/rvl-lab-utoronto/BFP
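The difference between direct feature distillation and a backward projection can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes the BFP loss is the squared distance between the linearly projected new features and the old features, and it fits the projection `A` in closed form by least squares as a stand-in for learning it by gradient descent together with the network. The feature matrices, the drift matrix `T`, and the classifier weights `w` are hypothetical toy data.

```python
import numpy as np

def feature_distillation_loss(new_feats, old_feats):
    """Direct feature distillation: penalize any deviation from the old features."""
    return float(np.mean(np.sum((new_feats - old_feats) ** 2, axis=1)))

def bfp_loss(new_feats, old_feats, A):
    """Backward projection: map new features through A, then compare to the old ones."""
    return float(np.mean(np.sum((new_feats @ A - old_feats) ** 2, axis=1)))

def fit_projection(new_feats, old_feats):
    """Least-squares A minimizing ||new_feats @ A - old_feats||^2 (assumed stand-in for SGD)."""
    A, *_ = np.linalg.lstsq(new_feats, old_feats, rcond=None)
    return A

rng = np.random.default_rng(0)
old_feats = rng.normal(size=(200, 4))  # toy features from the old model

# Toy feature drift: an upper-triangular matrix with diagonal 2 (determinant 16, so invertible).
T = np.triu(rng.normal(size=(4, 4)), k=1) + 2.0 * np.eye(4)
new_feats = old_feats @ T              # new features are a linear map of the old ones

A = fit_projection(new_feats, old_feats)
fd = feature_distillation_loss(new_feats, old_feats)
bfp = bfp_loss(new_feats, old_feats, A)

# An old linear classifier w remains usable on the new features after composing with A.
w = rng.normal(size=(4, 3))            # toy linear classifier for 3 old classes
old_logits = old_feats @ w
new_logits = new_feats @ (A @ w)

print(f"direct distillation loss: {fd:.4f}")
print(f"backward projection loss: {bfp:.2e}")
print("old-class logits preserved:", np.allclose(old_logits, new_logits))
```

In this toy setting the new features differ from the old ones, so the direct distillation loss is large, while the backward projection loss is numerically zero and the old-class logits are recovered exactly. That is the sense in which a linear change of the feature space leaves the old classes linearly separable.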

Qiao Gu, Dongsub Shim, Florian Shkurti • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Continual Learning | CIFAR100 (test) | Mean Accuracy | 57.39 | 62 |
| Continual Learning | CIFAR-10 (test) | Final Average Accuracy (FAA) | 73.51 | 31 |
| Continual Learning | CIFAR-100 Split 10 sequential tasks (test) | Final Forgetting (FF) | 19.76 | 24 |
| Continual Learning | TinyImageNet Split 10 sequential tasks (test) | Final Forgetting | 27.19 | 24 |
| Continual Learning | CIFAR-10 Split 5 sequential tasks (test) | Final Forgetting (FF) | 16.81 | 24 |
| Continual Learning | CIFAR-10 | ECE | 9.4 | 15 |
| Continual Learning | CIFAR-100 | ECE | 9.28 | 15 |
| Continual Learning | Tiny-ImageNet | ECE | 8.25 | 14 |
