Provable Contrastive Continual Learning

About

Continual learning requires learning incremental tasks with dynamic data distributions. So far, it has been observed that employing a combination of contrastive loss and distillation loss for training in continual learning yields strong performance. To the best of our knowledge, however, this contrastive continual learning framework lacks convincing theoretical explanations. In this work, we fill this gap by establishing theoretical performance guarantees, which reveal how the performance of the model is bounded by training losses of previous tasks in the contrastive continual learning framework. Our theoretical explanations further support the idea that pre-training can benefit continual learning. Inspired by our theoretical analysis of these guarantees, we propose a novel contrastive continual learning algorithm called CILA, which uses adaptive distillation coefficients for different tasks. These distillation coefficients are easily computed by the ratio between average distillation losses and average contrastive losses from previous tasks. Our method shows great improvement on standard benchmarks and achieves new state-of-the-art performance.

Yichen Wen, Zhiquan Tan, Kaipeng Zheng, Chuanlong Xie, Weiran Huang• 2024

Related benchmarks

Task	Dataset	Result
Class-incremental learning	CIFAR-10 Seq	Final Average Accuracy (FAA)76.03	53
Image Classification	CIFAR-10 Seq	Final Average Accuracy93.29	52
Image Classification	Seq-CIFAR-100	Accuracy68.29	52
Image Classification	Seq-Tiny-ImageNet	Final Average Accuracy49.19	44
Task-Incremental Learning	Seq-CIFAR-10	FAA93.29	28
Task-Incremental Learning	CIFAR-100 Seq	FAA68.29	28
Task-Incremental Learning	Tiny ImageNet Seq	FF27.83	27
Task-Incremental Learning	CIFAR-10 Seq	Average Accuracy96.4	25
Class-incremental learning	TinyImageNet Seq	Average Accuracy20.64	25
Task-Incremental Learning	Seq-Tiny-ImageNet	Average Accuracy54.13	25

Showing 10 of 10 rows

Other info

Follow for update

@wizwand_team Discord