MagMax: Leveraging Model Merging for Seamless Continual Learning

About

This paper introduces a continual learning approach named MagMax, which utilizes model merging to enable large pre-trained models to continuously learn from new data without forgetting previously acquired knowledge. Distinct from traditional continual learning methods that aim to reduce forgetting during task training, MagMax combines sequential fine-tuning with maximum-magnitude weight selection for effective knowledge integration across tasks. Our first contribution is an extensive examination of model merging techniques, revealing that simple approaches like weight averaging and random weight selection hold up surprisingly well in various continual learning contexts. More importantly, we present MagMax, a novel model-merging strategy that enables continual learning of large pre-trained models on successive tasks. Our thorough evaluation demonstrates the superiority of MagMax in various scenarios, including class- and domain-incremental learning settings. The code is available at https://github.com/danielm1405/magmax.
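The merging step described in the abstract can be illustrated with a small sketch. It is not the authors' implementation (see the linked repository for that) and rests on assumptions not stated above: each checkpoint is compared with the pre-trained weights through a per-parameter difference (a "task vector"), the difference with the largest absolute value is kept for every parameter, and the result is added back to the pre-trained weights with a hypothetical `scale` factor. The names `task_vector` and `max_magnitude_merge` and the toy weights are illustrative only.

```python
from typing import Dict, List

# Toy stand-in for a model: parameter name -> flat list of weights.
Weights = Dict[str, List[float]]


def task_vector(finetuned: Weights, pretrained: Weights) -> Weights:
    """Element-wise difference between a fine-tuned checkpoint and the pre-trained weights."""
    return {
        name: [f - p for f, p in zip(finetuned[name], pretrained[name])]
        for name in pretrained
    }


def max_magnitude_merge(
    pretrained: Weights, checkpoints: List[Weights], scale: float = 1.0
) -> Weights:
    """For every parameter, keep the change with the largest absolute value.

    `checkpoints` are assumed to come from sequential fine-tuning
    (task 1, then task 2 starting from the task-1 weights, and so on).
    `scale` is a hypothetical scaling factor applied to the merged change.
    """
    vectors = [task_vector(c, pretrained) for c in checkpoints]
    merged: Weights = {}
    for name, base in pretrained.items():
        merged[name] = []
        for i, b in enumerate(base):
            candidates = [v[name][i] for v in vectors]
            best = max(candidates, key=abs)  # largest-magnitude change wins
            merged[name].append(b + scale * best)
    return merged


pretrained = {"w": [0.0, 1.0, -1.0]}
after_task_1 = {"w": [0.5, 1.0, -1.25]}
after_task_2 = {"w": [0.25, 0.0, -1.5]}

merged = max_magnitude_merge(pretrained, [after_task_1, after_task_2])
print(merged)
```

In this toy case the first weight takes its change from the task-1 checkpoint and the other two take theirs from the task-2 checkpoint, since those are the changes with the largest magnitude at each position.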

Daniel Marczak, Bartłomiej Twardowski, Tomasz Trzciński, Sebastian Cygert • 2024

Related benchmarks

Task	Dataset	Result
Image Classification	TinyImageNet (test)	Accuracy 75.98	499
Image Classification	Stanford Cars (test)	Accuracy 88.61	320
Image Classification	CUB-200-2011 (test)	Top-1 Acc 70.95	303
Semantic segmentation	Cityscapes (val)	mIoU 70.04	301
Domain Generalization	PACS	Accuracy 98.62	263
Domain Generalization	OfficeHome	Accuracy 91.41	234
Depth Estimation	NYU Depth V2	--	209
Image Classification	Oxford Flowers-102 (test)	Top-1 Accuracy 86.51	200
Domain Generalization	DomainNet	Accuracy 62.24	153
Domain Generalization	TerraIncognita	Accuracy 55.16	121

Showing 10 of 66 rows
