Mind the Interference: Retaining Pre-trained Knowledge in Parameter Efficient Continual Learning of Vision-Language Models

About

This study addresses the Domain-Class Incremental Learning problem, a realistic but challenging continual learning scenario in which both the domain distribution and the target classes vary across tasks. To handle these diverse tasks, pre-trained Vision-Language Models (VLMs) are introduced for their strong generalizability. However, this creates a new problem: the knowledge encoded in the pre-trained VLMs may be disturbed when adapting to new tasks, compromising their inherent zero-shot ability. Existing methods tackle this by tuning VLMs with knowledge distillation on extra datasets, which incurs heavy computational overhead. To address the problem efficiently, we propose the Distribution-aware Interference-free Knowledge Integration (DIKI) framework, which retains the pre-trained knowledge of VLMs by avoiding information interference. Specifically, we design a fully residual mechanism to infuse newly learned knowledge into a frozen backbone while introducing minimal adverse impact on pre-trained knowledge. This residual property also enables our distribution-aware integration calibration scheme, which explicitly controls the information implantation process for test data from unseen distributions. Experiments demonstrate that DIKI surpasses the current state-of-the-art approach while using only 0.86% of the trained parameters and requiring substantially less training time. Code is available at: https://github.com/lloongx/DIKI
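The two ideas in the abstract, a residual branch added on top of a frozen backbone and a calibration weight that scales that branch for out-of-distribution inputs, can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that). The 2x2 weights, the function names, and the Gaussian-style weighting are all assumptions chosen only to show the general pattern.

```python
import math

# Frozen "pre-trained" weights: a fixed 2x2 matrix that is never updated.
# (Toy values, assumed for illustration only.)
W_FROZEN = [[0.5, -0.2], [0.1, 0.8]]


def matvec(matrix, vec):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vec)) for row in matrix]


def frozen_backbone(x):
    """Stand-in for a frozen pre-trained layer."""
    return [math.tanh(h) for h in matvec(W_FROZEN, x)]


def residual_branch(x, delta):
    """Hypothetical trainable branch; `delta` is zero-initialized, so it
    contributes nothing until it has been trained."""
    return matvec(delta, x)


def calibration_weight(x, mean, std):
    """Hypothetical calibration: 1.0 at the training-feature mean, decaying
    toward 0.0 as x moves away from the task's training distribution."""
    dist_sq = sum(((xi - m) / s) ** 2 for xi, m, s in zip(x, mean, std)) / len(x)
    return math.exp(-0.5 * dist_sq)


def integrate(x, delta, weight=1.0):
    """Frozen output plus a weighted residual; weight=0 recovers the backbone."""
    base = frozen_backbone(x)
    return [b + weight * r for b, r in zip(base, residual_branch(x, delta))]


x = [1.0, -2.0]
zero_delta = [[0.0, 0.0], [0.0, 0.0]]
# Before any training, the integrated output equals the frozen output.
print(integrate(x, zero_delta) == frozen_backbone(x))  # prints True
```

Because the new knowledge enters only as an additive term, setting the weight to zero returns exactly the frozen model's output. That property is what makes a per-input calibration weight meaningful in this sketch.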

Longxiang Tang, Zhuotao Tian, Kai Li, Chunming He, Hantao Zhou, Hengshuang Zhao, Xiu Li, Jiaya Jia • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Multi-Task Incremental Learning | MTIL Order II | Average Acc: 74.5 | 76 |
| Multi-domain Task-Incremental Learning | MTIL Order I (test) | Average Accuracy: 76.3 | 30 |
| Multi-Task Incremental Learning | MTIL | Average Accuracy: 76.4 | 20 |
| Image Classification | VDD | Average Accuracy: 65.9 | 20 |
| Continual Learning | VDD | Figure of Merit (FoM): 370.8 | 18 |
| Continual Learning | MTIL | FoM: 6.4 | 18 |
| Multi-domain Task-Incremental Learning | MTIL Order I | Transfer Acc: 68.7 | 17 |
| Multi-Domain Task-Incremental Learning (Transfer) | Multi-Domain Task-Incremental Learning Sequence (Aircraft, Caltech101, CIFAR100, DTD, Flowers, Food, StanfordCars, SUN397) 16-shot (test) | Caltech101 Accuracy: 95.6 | 16 |
| Image Classification | MNIST MDCII Order-II | Transfer Accuracy: 89.5 | 15 |
| Image Classification | Aircraft MDCII Order-II | Transfer Accuracy: 93.6 | 15 |
Showing 10 of 39 rows