
Addressing Loss of Plasticity and Catastrophic Forgetting in Continual Learning

About

Deep representation learning methods struggle with continual learning, suffering from both catastrophic forgetting of useful units and loss of plasticity, often due to rigid and unuseful units. While many methods address these two issues separately, only a few currently deal with both simultaneously. In this paper, we introduce Utility-based Perturbed Gradient Descent (UPGD) as a novel approach for the continual learning of representations. UPGD combines gradient updates with perturbations, where it applies smaller modifications to more useful units, protecting them from forgetting, and larger modifications to less useful units, rejuvenating their plasticity. We use a challenging streaming learning setup where continual learning problems have hundreds of non-stationarities and unknown task boundaries. We show that many existing methods suffer from at least one of the issues, predominantly manifested by their decreasing accuracy over tasks. On the other hand, UPGD continues to improve performance and surpasses or is competitive with all methods in all problems. Finally, in extended reinforcement learning experiments with PPO, we show that while Adam exhibits a performance drop after initial learning, UPGD avoids it by addressing both continual learning issues.
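The update rule described above can be illustrated with a minimal sketch. It is not the authors' implementation. The abstract does not say how utility is computed, so the sketch takes a per-unit utility score in [0, 1] as a given input. The function name `utility_gated_step`, the `(1 - utility)` gating form, and the default values for `lr` and `noise_std` are all assumptions made for illustration.

```python
import numpy as np


def utility_gated_step(weights, grad, utility, lr=0.01, noise_std=0.001, rng=None):
    """One illustrative update step: gradient plus noise, scaled by (1 - utility).

    Assumption: `utility` is a per-unit score in [0, 1] supplied by the caller.
    How UPGD actually estimates utility is not stated in the abstract.
    """
    rng = np.random.default_rng() if rng is None else rng
    utility = np.clip(utility, 0.0, 1.0)

    # Gaussian perturbation added to the gradient (hypothetical noise model).
    noise = rng.normal(0.0, noise_std, size=weights.shape)

    # Useful units (utility near 1) get a small gate and barely change;
    # less useful units (utility near 0) receive the full perturbed update.
    gate = 1.0 - utility
    return weights - lr * gate * (grad + noise)


# Usage: two units with the same gradient but opposite utility.
w = np.array([1.0, 1.0])
g = np.array([0.5, 0.5])
u = np.array([1.0, 0.0])  # first unit fully useful, second not useful
new_w = utility_gated_step(w, g, u, rng=np.random.default_rng(0))
```

In this sketch the first unit is left unchanged because its gate is zero, while the second unit takes the full noisy gradient step. A soft gate like this is one simple way to trade protection against plasticity per unit; other gating functions would fit the abstract's description equally well.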

Mohamed Elsayed, A. Rupam Mahmood • 2024

Related benchmarks

Task | Dataset | Result | Rank
Class-incremental learning | CIFAR-100 20 tasks | -- | 58
Task-Incremental Learning | Tiny-ImageNet 20 tasks | Average Accuracy 59.7 | 54
Task-Incremental Learning | CIFAR-100 10 tasks | Backward Transfer -3.1 | 44
Image Classification | CIFAR10-C | Mean Accuracy (mAcc) 53.9 | 41
Classification | Covertype | Accuracy 23.4 | 40
Image Classification | TinyImageNet-C | Accuracy 19.6 | 30
Wafer Map Defect Classification | WM811K | Macro F1 74.6 | 28
Task-Incremental Learning | CIFAR-100 (20-split) | Accuracy 78 | 27
Medical Image Classification | Camelyon17 | Accuracy 88.7 | 19
Continual Learning Evaluation | Six Benchmark Datasets Overall | Average Accuracy 48.2 | 19
Showing 10 of 17 rows
