Mitigating Forgetting in Continual Learning with Selective Gradient Projection

About

As neural networks are increasingly deployed in dynamic environments, they face the challenge of catastrophic forgetting: the tendency to overwrite previously learned knowledge when adapting to new tasks, which severely degrades performance on earlier tasks. We propose Selective Forgetting-Aware Optimization (SFAO), a dynamic method that regulates gradient directions via cosine similarity and per-layer gating, enabling controlled forgetting while balancing plasticity and stability. SFAO selectively projects, accepts, or discards updates using a tunable mechanism with an efficient Monte Carlo approximation. Experiments on standard continual learning benchmarks show that SFAO achieves competitive accuracy with markedly lower memory cost (a 90% reduction) and reduced forgetting on MNIST datasets, making it suitable for resource-constrained scenarios.
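
The abstract describes the mechanism only at a high level, so the following is a minimal sketch of what a cosine-similarity gate that accepts, projects, or discards a layer's gradient could look like. It is not the authors' implementation. The function name `gate_gradient`, the thresholds `accept_thr` and `discard_thr`, the sample count `num_samples`, and the choice to act on the worst sampled similarity are all assumptions made for illustration. The random subsample of stored gradients stands in for the Monte Carlo approximation.

```python
import math
import random


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    # Cosine similarity; defined as 0.0 when either vector is all zeros.
    denom = math.sqrt(dot(a, a)) * math.sqrt(dot(b, b))
    return dot(a, b) / denom if denom > 0 else 0.0


def gate_gradient(grad, past_grads, accept_thr=0.0, discard_thr=-0.5,
                  num_samples=4, rng=None):
    """Gate one layer's flattened gradient against stored past-task gradients.

    Returns (action, update), where action is "accept", "project", or
    "discard". Thresholds and sampling scheme are hypothetical.
    """
    rng = rng or random.Random(0)
    if not past_grads:
        return "accept", list(grad)

    # Monte Carlo style approximation: compare against a random subsample
    # of stored gradients instead of all of them.
    k = min(num_samples, len(past_grads))
    sample = rng.sample(past_grads, k)
    worst = min(cosine(grad, p) for p in sample)

    if worst >= accept_thr:
        return "accept", list(grad)               # no sampled conflict
    if worst < discard_thr:
        return "discard", [0.0] * len(grad)       # strong conflict: drop update

    # Mild conflict: remove the component that opposes each conflicting
    # sampled gradient. Projections are applied one after another, so with
    # several non-orthogonal samples the result is only approximate.
    update = list(grad)
    for p in sample:
        d = dot(update, p)
        if d < 0:
            scale = d / dot(p, p)
            update = [u - scale * x for u, x in zip(update, p)]
    return "project", update
```

Under these assumptions, per-layer gating would amount to calling `gate_gradient` once per layer with that layer's flattened gradient, so that one layer's update can be discarded while another layer's is accepted unchanged.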

Anika Singh, Aayush Dhaulakhandi, Varun Chopade, Likhith Malipati, David Martinez, Kevin Zhu • 2026

Related benchmarks

Task	Dataset	Result
Image Classification	MNIST Split	--	24
Image Classification	CIFAR-10 Split	--	12
Continual Learning	CIFAR-100 Split	--	10
Continual Learning	CIFAR-10 Split (test)	Mean BWT 77	7
Image Classification	TinyImageNet Split	Task 1 Score 24.4	5
Image Classification	Permuted MNIST p1, p2, p3 (test)	Task 1 Accuracy 76	5
Image Classification	CIFAR-100 Split (test)	Task 1 Accuracy 10.1	5
