
SyMerge: From Non-Interference to Synergistic Merging via Single-Layer Adaptation

About

Model merging combines independently trained models into a single multi-task model. However, most existing approaches focus primarily on avoiding task interference. We argue that its greater potential lies in enabling task synergy, where tasks actively improve one another. We identify cross-task performance, defined by compatibility between encoders and predictors across tasks, as a key indicator of merge quality. We demonstrate that adapting only a single task-specific layer is sufficient to induce such synergy. This study proposes SyMerge, a lightweight framework that jointly optimizes merging coefficients and a single task-specific layer. We adopt an expert-guided self-labeling objective, providing stable supervision beyond entropy minimization. Intriguingly, we further show that SyMerge successfully merges models trained from different initializations, a regime where standard methods break down. Our minimalist yet principled method achieves state-of-the-art results across vision, dense prediction, and NLP benchmarks. Our code is available at https://aim-skku.github.io/SyMerge
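The two ingredients named in the abstract, merging coefficients and an expert-guided self-labeling objective, can be illustrated with a minimal NumPy sketch. The abstract gives no formulas, so the specifics here are assumptions: the merge is written in task-arithmetic form (pretrained weights plus a weighted sum of task vectors), the pseudo-labels are the argmax of each individual expert's logits, and all function names are hypothetical. The sketch only computes forward quantities; it does not reproduce the joint training of coefficients and the single task-specific layer.

```python
import numpy as np

def merge_parameters(theta_pre, thetas_finetuned, lambdas):
    """Assumed task-arithmetic merge: theta_pre + sum_k lambda_k * (theta_k - theta_pre)."""
    theta_pre = np.asarray(theta_pre, dtype=float)
    merged = theta_pre.copy()
    for lam, theta_k in zip(lambdas, thetas_finetuned):
        merged += lam * (np.asarray(theta_k, dtype=float) - theta_pre)
    return merged

def log_softmax(logits):
    """Numerically stable log-softmax over the last axis."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=-1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))

def self_labeling_loss(merged_logits, expert_logits):
    """Cross-entropy of the merged model against the expert's hard pseudo-labels (assumed form)."""
    pseudo_labels = np.asarray(expert_logits).argmax(axis=-1)
    log_probs = log_softmax(merged_logits)
    return -log_probs[np.arange(len(pseudo_labels)), pseudo_labels].mean()

def entropy_loss(merged_logits):
    """Mean prediction entropy, the label-free objective the abstract contrasts against."""
    log_probs = log_softmax(merged_logits)
    return -(np.exp(log_probs) * log_probs).sum(axis=-1).mean()

# Toy data: the expert is confident, the merged model is confident but disagrees with it.
expert_logits = np.array([[10.0, 0.0], [0.0, 10.0]])
flipped_logits = expert_logits[:, ::-1]
print(entropy_loss(flipped_logits))                       # low: entropy cannot see the disagreement
print(self_labeling_loss(flipped_logits, expert_logits))  # high: the expert's labels expose it
```

In this toy case the merged model is confidently wrong, so its entropy is already near zero while the self-labeling loss stays large. That is one plausible reading of why expert guidance would give steadier supervision than entropy minimization alone.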

Aecheon Jung, Seunghwan Lee, Dongyoon Han, Sungeun Hong • 2024

Related benchmarks

Task | Dataset | Result | Rank
Depth Estimation | NYU Depth V2 | -- | 209
Image Classification | 20 Vision Classification Tasks | Average Accuracy: 93.2 | 131
Image Classification | 14 Vision Tasks | Average Accuracy: 92.8 | 121
Surface Normal Estimation | NYU V2 | Mean Angular Error: 26.2 | 65
Image Classification | 8 vision tasks Average | Average Accuracy: 94.1 | 53
Semantic segmentation | NYU V2 | mIoU: 49.8 | 30
Natural Language Understanding | GLUE | CoLA Score: 60 | 12
Multi-task Text Generation | GLUE 8 tasks | CoLA Score: 69.1 | 9
