SyMerge: From Non-Interference to Synergistic Merging via Single-Layer Adaptation

About

Model merging combines independently trained models into a single multi-task model. However, most existing approaches focus primarily on avoiding task interference. We argue that the greater potential of merging lies in enabling task synergy, where tasks actively improve one another. We identify cross-task performance, defined by the compatibility between encoders and predictors across tasks, as a key indicator of merge quality. We demonstrate that adapting only a single task-specific layer is sufficient to induce such synergy. We propose SyMerge, a lightweight framework that jointly optimizes merging coefficients and a single task-specific layer. We adopt an expert-guided self-labeling objective, which provides more stable supervision than entropy minimization alone. Intriguingly, we further show that SyMerge successfully merges models trained from different initializations, a regime where standard methods break down. Our minimalist yet principled method achieves state-of-the-art results across vision, dense prediction, and NLP benchmarks. Our code is available at https://aim-skku.github.io/SyMerge
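
The two ingredients named in the abstract, merging coefficients and an expert-guided self-labeling objective, can be illustrated with a small sketch. This is not the authors' implementation: the abstract gives no formulas, so the task-arithmetic style merge rule, the hard pseudo-label cross-entropy, and all function names (`merge_weights`, `self_labeling_loss`, `entropy_loss`) are assumptions made for illustration only.

```python
import math

def merge_weights(pretrained, experts, coeffs):
    """Assumed merge rule: theta_0 + sum_k lambda_k * (theta_k - theta_0)."""
    merged = list(pretrained)
    for expert, lam in zip(experts, coeffs):
        for i, (w_e, w_0) in enumerate(zip(expert, pretrained)):
            merged[i] += lam * (w_e - w_0)
    return merged

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def self_labeling_loss(merged_logits, expert_logits):
    """Assumed objective: cross-entropy of the merged model's prediction
    against the hard pseudo-label (argmax) produced by the task expert."""
    label = max(range(len(expert_logits)), key=expert_logits.__getitem__)
    return -math.log(softmax(merged_logits)[label])

def entropy_loss(merged_logits):
    """Entropy of the merged model's own prediction, for comparison."""
    probs = softmax(merged_logits)
    return -sum(p * math.log(p) for p in probs if p > 0)

# Toy example: two experts fine-tuned from the same two-parameter model.
merged = merge_weights([0.0, 1.0], [[1.0, 1.0], [0.0, 3.0]], [0.5, 0.5])

# A confidently wrong merged prediction has low entropy, so entropy
# minimization barely penalizes it, while the expert's label does.
wrong_logits, expert_logits = [5.0, 0.0], [0.0, 5.0]
ce = self_labeling_loss(wrong_logits, expert_logits)
ent = entropy_loss(wrong_logits)
```

In this toy case the expert-guided loss is large while the entropy is close to zero, which is one way to read the claim that expert guidance supplies a supervision signal that entropy minimization lacks. In a full method the coefficients and the single adapted layer would be updated by gradient descent on such a loss; that optimization loop is omitted here.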

Aecheon Jung, Seunghwan Lee, Dongyoon Han, Sungeun Hong • 2024

Related benchmarks

Task	Dataset	Result
Depth Estimation	NYU Depth V2	--	226
Image Classification	20 Vision Classification Tasks	Average Accuracy 93.2	170
Image Classification	14 Vision Tasks	Average Accuracy 92.8	160
Surface Normal Estimation	NYU V2	Mean Angular Error 26.2	96
Image Classification	8 vision tasks Average	Average Accuracy 94.1	53
Semantic Segmentation	NYU V2	mIoU 49.8	30
Natural Language Understanding	GLUE	CoLA Score 60	12
Multi-task Text Generation	GLUE 8 tasks	CoLA Score 69.1	9
