
BECAME: BayEsian Continual Learning with Adaptive Model MErging

About

Continual Learning (CL) strives to learn incrementally across tasks while mitigating catastrophic forgetting. A key challenge in CL is balancing stability (retaining prior knowledge) and plasticity (learning new tasks). While representative gradient projection methods ensure stability, they often limit plasticity. Model merging techniques offer promising solutions, but prior methods typically rely on empirical assumptions and carefully selected hyperparameters. In this paper, we explore the potential of model merging to enhance the stability-plasticity trade-off, providing theoretical insights that underscore its benefits. Specifically, we reformulate the merging mechanism using Bayesian continual learning principles and derive a closed-form solution for the optimal merging coefficient that adapts to the diverse characteristics of tasks. To validate our approach, we introduce a two-stage framework named BECAME, which synergizes the expertise of gradient projection and adaptive merging. Extensive experiments show that our approach outperforms state-of-the-art CL methods and existing merging strategies.
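The abstract does not state the closed-form coefficient itself, so the following is only an illustrative sketch of how such a coefficient can arise, not the paper's formula. It assumes the merged model lies on the line between the previous-task and new-task parameters, and that each task's loss is approximated by a quadratic around its own solution with a diagonal curvature (Fisher-like) estimate. The function names, the diagonal approximation, and the toy numbers are all hypothetical.

```python
# Illustrative sketch only: NOT the BECAME formula, which the abstract does not give.
# Assumptions (hypothetical): linear interpolation between two parameter vectors,
# and a diagonal quadratic surrogate loss for each task around its own solution.

def merge_coefficient(theta_old, theta_new, fisher_old, fisher_new):
    """Return lam in [0, 1] minimizing, over theta = theta_old + lam * d,
    0.5 * sum(f_old * (theta - theta_old)^2) + 0.5 * sum(f_new * (theta - theta_new)^2),
    where d = theta_new - theta_old. Setting the derivative to zero gives
    lam = b / (a + b) with a = sum(f_old * d^2) and b = sum(f_new * d^2)."""
    a = 0.0  # curvature of the old-task surrogate along d
    b = 0.0  # curvature of the new-task surrogate along d
    for t_old, t_new, f_old, f_new in zip(theta_old, theta_new, fisher_old, fisher_new):
        d2 = (t_new - t_old) ** 2
        a += f_old * d2
        b += f_new * d2
    if a + b == 0.0:
        return 0.5  # no curvature signal (or identical models): fall back to the midpoint
    return b / (a + b)


def merge(theta_old, theta_new, lam):
    """Linearly interpolate the two parameter vectors with weight lam on the new model."""
    return [(1.0 - lam) * o + lam * n for o, n in zip(theta_old, theta_new)]


# Toy example: the old task is three times as sensitive along the merge direction,
# so the merged model stays closer to the old parameters.
theta_old = [1.0, 0.0]
theta_new = [0.0, 1.0]
fisher_old = [3.0, 0.0]
fisher_new = [0.0, 1.0]

lam = merge_coefficient(theta_old, theta_new, fisher_old, fisher_new)
merged = merge(theta_old, theta_new, lam)
print(lam, merged)
```

In this sketch a larger old-task curvature pulls the coefficient toward 0 (stability), while a larger new-task curvature pulls it toward 1 (plasticity), which is one way a coefficient can adapt to task characteristics without a hand-tuned hyperparameter.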

Mei Li, Yuxiang Lu, Qinyan Dai, Suizhi Huang, Yue Ding, Hongtao Lu • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Class-incremental learning | CIFAR100 10 Tasks | Accuracy | 75.2 | 66 |
| Class-incremental learning | ImageNet-R 5-task | -- | -- | 64 |
| Class-incremental learning | CIFAR-100 20 tasks | Accuracy | 73.5 | 58 |
| Continual Learning | CIFAR-100 | Task Forgetting (FGT_T) | 4.8 | 46 |
| Continual Learning | ImageNet-R | Average Forgetting | 3.1 | 39 |
| Class-incremental learning | ImageNet-R (20 tasks) | Accuracy (20 Tasks) | 78.7 | 32 |
| Class-incremental learning | CIFAR100 5 Tasks | Accuracy | 80.9 | 31 |
| Class-incremental learning | ImageNet-R 10 tasks | Accuracy (10 Tasks) | 79.8 | 31 |
| Image Classification | ImageNet-R 10 tasks | ACC10 | 87.8 | 16 |
| Image Classification | ImageNet-R 5 tasks | Accuracy | 87.9 | 10 |
Showing 10 of 13 rows
