
Fisher-Orthogonal Projected Natural Gradient Descent for Continual Learning

About

Continual learning aims to enable neural networks to acquire new knowledge on sequential tasks. However, the key challenge in such settings is to learn new tasks without catastrophically forgetting previously learned tasks. We propose the Fisher-Orthogonal Projected Natural Gradient Descent (FOPNG) optimizer, which enforces Fisher-orthogonal constraints on parameter updates to preserve old task performance while learning new tasks. Unlike existing methods that operate in Euclidean parameter space, FOPNG projects gradients onto the Fisher-orthogonal complement of previous task gradients. This approach unifies natural gradient descent with orthogonal gradient methods within an information-geometric framework. We provide theoretical analysis deriving the projected update, describe efficient and practical implementations using the diagonal Fisher, and demonstrate strong results on standard continual learning benchmarks such as Permuted-MNIST, Split-MNIST, Rotated-MNIST, Split-CIFAR10, and Split-CIFAR100. Our code is available at https://github.com/ishirgarg/FOPNG.
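To make the central idea concrete, here is a minimal sketch of projecting a gradient onto the Fisher-orthogonal complement of previous task gradients under a diagonal Fisher approximation. It is not taken from the authors' repository. The function names `fisher_dot` and `fisher_orthogonal_project`, the Gram-Schmidt construction, and the toy numbers are all assumptions made for illustration. The sketch covers only the projection step and does not attempt the natural-gradient part of the FOPNG update.

```python
def fisher_dot(u, v, fisher_diag):
    """Fisher inner product <u, v>_F = sum_i F_ii * u_i * v_i (diagonal Fisher)."""
    return sum(f * a * b for f, a, b in zip(fisher_diag, u, v))


def fisher_orthogonal_project(grad, prev_grads, fisher_diag, tol=1e-12):
    """Return `grad` with its components along span(prev_grads) removed,
    where "along" is measured in the Fisher inner product.

    Hypothetical helper: illustrates the projection only, not the full optimizer.
    """
    # Build a Fisher-orthonormal basis of the previous gradients (Gram-Schmidt).
    basis = []
    for g in prev_grads:
        r = list(g)
        for q in basis:
            c = fisher_dot(r, q, fisher_diag)
            r = [ri - c * qi for ri, qi in zip(r, q)]
        norm_sq = fisher_dot(r, r, fisher_diag)
        if norm_sq > tol:  # skip directions already covered by the basis
            scale = norm_sq ** 0.5
            basis.append([ri / scale for ri in r])

    # Subtract the Fisher-projection of `grad` onto each basis vector.
    out = list(grad)
    for q in basis:
        c = fisher_dot(out, q, fisher_diag)
        out = [oi - c * qi for oi, qi in zip(out, q)]
    return out


# Toy example with made-up numbers (three parameters, two stored gradients).
fisher_diag = [2.0, 1.0, 0.5]
prev_grads = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
grad = [0.3, -0.7, 1.2]
projected = fisher_orthogonal_project(grad, prev_grads, fisher_diag)
```

In this toy case the stored gradients span the first two coordinates, so the projected gradient keeps only its third component, and its Fisher inner product with each stored gradient is zero up to floating-point error.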

Ishir Garg, Neel Kolhe, Andy Peng, Rohan Gopalam • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| Continual Language Modeling | HOPE 8-domain, clean-regime 256M (3 seeds) | Avg PPL 36.8 | 19 |
| Continual Learning | HOPE 8-domain (test) | Forgetting 10.3 | 9 |
| Continual Language Modeling | HOPE 8-domain, high-overlap non-adaptive stream 256M | Forgetting 12.8 | 6 |
| Continual Language Modeling | HOPE 16-domain stream 256M (long continual sequence) | Forgetting 14.7 | 4 |
