
Dynamic Model Merging Made Slim

About

Model merging enables the reuse of fine-tuned models without joint training or access to the original data. Dynamic merging further improves flexibility by selectively activating task-relevant parameters and efficiently composing experts across multiple tasks. However, existing dynamic methods either maintain a full shared model with tiny experts or allocate excessive capacity to experts, leading to suboptimal accuracy-efficiency trade-offs. To address this, we propose DiDi-Merging, a slim dynamic merging framework that uses differentiable rank allocation to balance shared and expert parameters. By formulating parameter budgeting as differentiable rank optimization in low-rank modules and introducing a data-free refinement step to recover task fidelity, DiDi-Merging matches prior dynamic baselines at only 1.24x the parameters of a single fine-tuned model and surpasses them at 1.4x, making it substantially more compact than methods that require more than 2x storage. DiDi-Merging applies across vision, language, and multimodal tasks.
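
The storage multipliers quoted above (1.24x, 1.4x, more than 2x) can be made concrete with a little parameter accounting. The following is a minimal sketch, not the paper's own bookkeeping: it assumes one dense model is stored once, and that the shared module and each task expert are stored as rank-k factorizations costing k * (rows + cols) parameters per layer. The function names, layer shapes, and ranks are all hypothetical and chosen only for illustration.

```python
def low_rank_params(rows, cols, rank):
    """Parameters in a rank-`rank` factorization A (rows x rank) @ B (rank x cols)."""
    return rank * (rows + cols)


def storage_ratio(layer_shapes, shared_ranks, expert_ranks):
    """Total stored parameters relative to one dense fine-tuned model.

    layer_shapes: list of (rows, cols), one per weight matrix.
    shared_ranks: one rank per layer for the shared low-rank module.
    expert_ranks: one list per task, each holding one rank per layer.
    All of this is an assumed accounting scheme, not taken from the paper.
    """
    dense = sum(r * c for r, c in layer_shapes)
    shared = sum(
        low_rank_params(r, c, k) for (r, c), k in zip(layer_shapes, shared_ranks)
    )
    experts = sum(
        low_rank_params(r, c, k)
        for ranks in expert_ranks
        for (r, c), k in zip(layer_shapes, ranks)
    )
    return (dense + shared + experts) / dense


# Hypothetical two-layer model with 8 tasks; the ranks are illustrative only.
shapes = [(768, 768), (768, 3072)]
ratio = storage_ratio(shapes, shared_ranks=[16, 16], expert_ranks=[[4, 4]] * 8)
print(f"storage relative to one dense model: {ratio:.4f}x")
```

Under this accounting, raising the shared ranks or the per-task expert ranks moves the ratio in a predictable way, which is the trade-off a rank-allocation procedure has to balance against accuracy.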

Guodong Du, Wanyu Lin • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Visual Question Answering | VizWiz | Accuracy | 53.12 | 1820 |
| Image Classification | SVHN (test) | Accuracy | 96.5 | 470 |
| Visual Question Answering | ScienceQA | Accuracy | 72.32 | 446 |
| Image Classification | DTD (test) | Accuracy | 76.2 | 316 |
| Instruction Following | MT-Bench | MT-Bench Score | 8.7 | 287 |
| Visual Question Answering | OK-VQA | Accuracy | 31.29 | 272 |
| Image Classification | SUN397 (test) | Top-1 Accuracy | 77.3 | 231 |
| Image Classification | EuroSAT (test) | Accuracy | 99.5 | 177 |
| Visual Question Answering | GQA | Accuracy | 62.08 | 155 |
| Coding | MBPP | Accuracy | 83.3 | 145 |
Showing 10 of 40 rows
