Toward a Holistic Approach to Continual Model Merging
About
We present a holistic framework for Continual Model Merging (CMM) that intervenes at three critical stages (pre-merging, during merging, and post-merging) to address two fundamental challenges in continual learning. Conventional approaches either maintain a growing list of per-domain task vectors, which leads to scalability issues, or rely solely on weight-space merging when old data is inaccessible, thereby losing crucial functional information. Our method overcomes these limitations by first fine-tuning the main model within its tangent space on domain-specific data; this linearization amplifies per-task weight disentanglement, effectively mitigating cross-task interference. During merging, we go beyond plain parameter averaging and leverage functional information from the available optimizer states, which avoids the need to revisit old data. Finally, a post-merging correction reduces the representation discrepancy between the pre- and post-merged models, reducing bias and enhancing overall performance. All three stages operate under constant memory constraints and without access to historical data. Extensive experiments on standard class-incremental and domain-incremental benchmarks demonstrate that our approach not only achieves competitive performance but also provides a scalable and efficient solution to the catastrophic forgetting problem.
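The first two stages can be illustrated with a minimal sketch. It is not the authors' implementation: the function names (`linearized_forward`, `merge_with_importance`, `toy_model`) are hypothetical, parameters are flat lists of floats, and the use of a per-parameter importance score (for example, an Adam-style second-moment estimate) as the merging weight is an assumption, since the abstract only says that functional information is taken from optimizer states. The linearization is the standard first-order Taylor expansion around the pre-trained weights, f_lin(x; θ) = f(x; θ0) + J(θ0)(θ − θ0), with the Jacobian-vector product approximated here by a central finite difference.

```python
from typing import Callable, List

Params = List[float]


def linearized_forward(f: Callable[[Params, float], float],
                       theta0: Params, theta: Params, x: float,
                       h: float = 1e-5) -> float:
    """First-order Taylor expansion of f around theta0 (tangent-space model).

    The directional derivative along (theta - theta0) is approximated with a
    central finite difference; a real implementation would use autodiff.
    """
    direction = [t - t0 for t, t0 in zip(theta, theta0)]
    plus = [t0 + h * d for t0, d in zip(theta0, direction)]
    minus = [t0 - h * d for t0, d in zip(theta0, direction)]
    jvp = (f(plus, x) - f(minus, x)) / (2.0 * h)
    return f(theta0, x) + jvp


def merge_with_importance(theta_a: Params, imp_a: Params,
                          theta_b: Params, imp_b: Params,
                          eps: float = 1e-8) -> Params:
    """Per-parameter importance-weighted average of two models.

    ASSUMPTION: imp_a / imp_b stand in for optimizer-state statistics
    (e.g. second-moment estimates); the source does not specify the rule.
    Adding eps to both weights falls back to a plain average when both
    importance scores are zero.
    """
    merged = []
    for pa, wa, pb, wb in zip(theta_a, imp_a, theta_b, imp_b):
        wa, wb = wa + eps, wb + eps
        merged.append((wa * pa + wb * pb) / (wa + wb))
    return merged


def toy_model(theta: Params, x: float) -> float:
    """Hypothetical nonlinear model used only to exercise the sketch."""
    return theta[0] * theta[1] * x
```

For `toy_model` with `theta0 = [1.0, 2.0]` and `theta = [2.0, 3.0]` at `x = 1.0`, the linearized prediction is approximately 5 while the original model gives 6; the gap is the second-order term that the tangent-space model discards. Both functions keep only a fixed number of vectors per merge step, in line with the constant-memory setting described above.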
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Continual Learning | CIFAR100 (test) | Mean Accuracy: 80.21 | 69 |
| Class-incremental learning | CIFAR100 10 Tasks | Accuracy: 77.72 | 66 |
| Class-incremental learning | ImageNet-R 5-task | -- | 64 |
| Class-incremental learning | CUB200 10 Tasks | -- | 59 |
| Class-incremental learning | CIFAR-100 20 tasks | Accuracy: 75.95 | 58 |
| Class-incremental learning | Stanford Cars CIL, T=10 (test) | Avg Accuracy: 69.7 | 33 |
| Class-incremental learning | ImageNet-R (20 tasks) | Accuracy (20 Tasks): 82 | 32 |
| Class-incremental learning | ImageNet-R 10 tasks | Accuracy (10 Tasks): 82.69 | 31 |
| Domain-incremental learning | Office-Home | -- | 22 |
| Continual Learning | ImageNet-R (test) | Accuracy: 82.69 | 20 |