Boosting Continual Learning of Vision-Language Models via Mixture-of-Experts Adapters
About
Continual learning can empower vision-language models to continuously acquire new knowledge, without the need for access to the entire historical dataset. However, mitigating the performance degradation in large-scale models is non-trivial due to (i) parameter shifts throughout lifelong learning and (ii) significant computational burdens associated with full-model tuning. In this work, we present a parameter-efficient continual learning framework to alleviate long-term forgetting in incremental learning with vision-language models. Our approach involves the dynamic expansion of a pre-trained CLIP model, through the integration of Mixture-of-Experts (MoE) adapters in response to new tasks. To preserve the zero-shot recognition capability of vision-language models, we further introduce a Distribution Discriminative Auto-Selector (DDAS) that automatically routes in-distribution and out-of-distribution inputs to the MoE Adapter and the original CLIP, respectively. Through extensive experiments across various settings, our proposed method consistently outperforms previous state-of-the-art approaches while concurrently reducing parameter training burdens by 60%. Our code is available at https://github.com/JiazuoYu/MoE-Adapters4CL
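The routing idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how experts are gated or how DDAS scores an input, so the top-k softmax gating, the distance-to-mean score, the threshold, and every name below (`moe_adapter`, `route`, `ood_score`) are assumptions made for illustration only.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_adapter(features, router_logits, experts, top_k=2):
    """Add a gated mix of the top-k experts' outputs to the input features.

    Assumption: residual adapters with top-k softmax gating. The abstract
    only states that MoE adapters are attached to a pre-trained CLIP model.
    """
    ranked = sorted(range(len(experts)), key=lambda i: router_logits[i], reverse=True)
    top = ranked[:top_k]
    gates = softmax([router_logits[i] for i in top])
    out = list(features)
    for gate, idx in zip(gates, top):
        delta = experts[idx](features)
        out = [o + gate * d for o, d in zip(out, delta)]
    return out


def route(features, ood_score, threshold, adapted_branch, frozen_branch):
    """Send in-distribution inputs to the adapted branch, the rest to the frozen model.

    Assumption: a scalar score compared with a threshold stands in for DDAS.
    """
    if ood_score(features) <= threshold:
        return "adapter", adapted_branch(features)
    return "frozen", frozen_branch(features)


# Hypothetical toy setup: three "experts" that each return a constant shift.
experts = [
    lambda x: [1.0 for _ in x],
    lambda x: [0.0 for _ in x],
    lambda x: [-1.0 for _ in x],
]
router_logits = [0.0, 0.0, -10.0]
task_mean = [1.0, 2.0]  # stand-in for statistics of a previously seen task


def ood_score(f):
    # Euclidean distance to the stored task mean (illustrative choice only).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(f, task_mean)))


def adapted(f):
    return moe_adapter(f, router_logits, experts)


def frozen(f):
    return list(f)  # placeholder for the unmodified pre-trained model


seen = route([1.0, 2.0], ood_score, 1.0, adapted, frozen)
unseen = route([10.0, 10.0], ood_score, 1.0, adapted, frozen)
```

In this toy setup, an input near the stored task mean is handled by the adapter branch, while a distant input falls back to the frozen branch, mirroring how the abstract describes preserving zero-shot behavior on out-of-distribution inputs.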
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | Food101 | Accuracy: 82.9 | 457 |
| Class-incremental learning | CIFAR-100 | Average Accuracy: 85.27 | 116 |
| Continual Learning | CIFAR-100 | -- | 56 |
| Class-incremental learning | ImageNet-R 10-task | -- | 54 |
| Image Classification | ImageNet 1k (full) | Top-1 Acc: 66 | 53 |
| Image Classification | ImageNet A | -- | 50 |
| Domain-incremental learning | CORe50 | Avg Accuracy (A): 94.9 | 49 |
| Class-incremental learning | ImageNet-R 5-task | Avg Accuracy (A_bar): 83.61 | 45 |
| Class-incremental learning | VTAB B0 Inc10 | Last Accuracy: 66.25 | 38 |
| Class-incremental learning | CIFAR100 10 Tasks | Accuracy: 84.75 | 36 |