
Overcoming catastrophic forgetting with hard attention to the task

About

Catastrophic forgetting occurs when a neural network loses the information learned in a previous task after training on subsequent tasks. This problem remains a hurdle for artificial intelligence systems with sequential learning capabilities. In this paper, we propose a task-based hard attention mechanism that preserves previous tasks' information without affecting the current task's learning. A hard attention mask is learned concurrently with every task through stochastic gradient descent, and previous masks are exploited to condition this learning. We show that the proposed mechanism is effective for reducing catastrophic forgetting, cutting current forgetting rates by 45 to 80%. We also show that it is robust to different hyperparameter choices and that it offers a number of monitoring capabilities. The approach makes it possible to control both the stability and compactness of the learned knowledge, which we believe also makes it attractive for online learning and network compression applications.

Joan Serrà, Dídac Surís, Marius Miron, Alexandros Karatzoglou • 2018
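
To make the mechanism in the abstract concrete, here is a minimal PyTorch sketch of the core ideas: a per-task embedding turned into a pseudo-binary mask by a sigmoid with an annealed slope, and gradient conditioning by the cumulative masks of previous tasks. All names (`HATLinear`, `S_MAX`, `anneal_slope`, `condition_gradients`) and the layer sizes are illustrative assumptions, not the authors' reference implementation.

```python
# Sketch of task-based hard attention, assuming PyTorch.
import torch
import torch.nn as nn

S_MAX = 400.0  # assumed maximum slope for the annealed sigmoid gate


class HATLinear(nn.Module):
    """Linear layer whose output units are gated by a per-task mask."""

    def __init__(self, in_f, out_f, n_tasks):
        super().__init__()
        self.fc = nn.Linear(in_f, out_f)
        self.emb = nn.Embedding(n_tasks, out_f)  # one embedding per task

    def mask(self, task, s):
        # Pseudo-binary attention: sigmoid with slope s over the task embedding.
        return torch.sigmoid(s * self.emb(torch.as_tensor(task)))

    def forward(self, x, task, s):
        return torch.relu(self.fc(x)) * self.mask(task, s)


def anneal_slope(batch_idx, n_batches, s_max=S_MAX):
    # Slope grows from 1/s_max to s_max over the epoch, so the gate starts
    # soft (trainable) and ends close to a hard 0/1 mask.
    return 1 / s_max + (s_max - 1 / s_max) * batch_idx / max(n_batches - 1, 1)


def condition_gradients(layer, cum_out, cum_in):
    # Previous masks condition learning: weights connecting units that any
    # earlier task attended to are (almost) frozen,
    # g'_ij = g_ij * (1 - min(cum_out_i, cum_in_j)).
    keep = 1 - torch.min(cum_out.unsqueeze(1), cum_in.unsqueeze(0))
    layer.fc.weight.grad *= keep
    layer.fc.bias.grad *= 1 - cum_out


def update_cumulative(layer, task, cum_out):
    # After finishing a task, fold its hardened mask into the running maximum.
    with torch.no_grad():
        return torch.max(cum_out, layer.mask(task, S_MAX))
```

In a training loop, one would anneal `s` per batch, call `condition_gradients` between `loss.backward()` and `optimizer.step()`, and call `update_cumulative` once a task is finished, so that units important to earlier tasks stay protected while unattended capacity remains free for new tasks.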

Related benchmarks

Task                  Dataset               Metric         Result  Rank
Image Classification  MNIST (test)          Accuracy       99.4    894
Image Classification  CIFAR-100             Accuracy       79.52   691
Image Classification  Fashion MNIST (test)  Accuracy       91.9    592
Image Classification  CIFAR-10              Accuracy       96.4    564
Image Classification  MNIST                 Accuracy       99.7    417
Image Classification  SVHN (test)           Accuracy       93.8    401
Image Classification  SVHN                  Accuracy       96.4    395
Classification        Cars                  Accuracy       73.22   395
Image Classification  CUB                   Accuracy       79.67   282
Image Classification  CIFAR100 (test)       Test Accuracy  49.1    147
(Showing 10 of 69 benchmark rows.)
