Dynamic Model Merging Made Slim
About
Model merging enables the reuse of fine-tuned models without joint training or access to original data. Dynamic merging further improves flexibility by selectively activating task-relevant parameters and efficiently composing experts across multiple tasks. However, existing dynamic methods either maintain a full shared model with tiny experts or allocate excessive capacity to experts, leading to suboptimal accuracy-efficiency trade-offs. To address this, we propose DiDi-Merging, a slim dynamic merging framework that leverages differentiable rank allocation to balance shared and expert parameters. By formulating parameter budgeting as differentiable rank optimization in low-rank modules and introducing a data-free refinement step to recover task fidelity, DiDi-Merging matches prior dynamic baselines at only 1.24x the parameters of a single fine-tuned model and surpasses them at 1.4x, making it substantially more compact than methods that require more than 2x storage. DiDi-Merging applies across vision, language, and multimodal tasks.
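To make the storage multipliers quoted above concrete, the following is a minimal accounting sketch, not the paper's procedure. It assumes a single weight matrix split into one shared component plus one low-rank expert per task, each stored as a pair of factors. The function names, the layer shape, and the ranks are all hypothetical and chosen only so the arithmetic is easy to follow.

```python
from typing import Optional, Sequence


def low_rank_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameter count of a rank-`rank` factorization U (d_out x r) @ V (r x d_in)."""
    return rank * (d_out + d_in)


def storage_ratio(
    d_out: int,
    d_in: int,
    shared_rank: Optional[int],
    expert_ranks: Sequence[int],
) -> float:
    """Total stored parameters relative to one dense d_out x d_in matrix.

    Assumption (not from the paper): the shared part is either dense
    (shared_rank=None) or a low-rank factorization, and every task adds
    one low-rank expert with its own rank.
    """
    dense = d_out * d_in
    if shared_rank is None:
        shared = dense
    else:
        # A factorization is never stored if it would cost more than the dense matrix.
        shared = min(low_rank_params(d_out, d_in, shared_rank), dense)
    experts = sum(low_rank_params(d_out, d_in, r) for r in expert_ranks)
    return (shared + experts) / dense


# Hypothetical 1024 x 1024 layer with 8 tasks.
full_shared_tiny_experts = storage_ratio(1024, 1024, None, [4] * 8)
balanced_allocation = storage_ratio(1024, 1024, 384, [32] * 8)
```

Under these illustrative numbers, shrinking the shared component frees budget for larger experts while the total stays near the size of a single model. How the ranks are actually chosen is what the differentiable rank allocation described above is meant to decide, and this sketch does not attempt to model that step.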
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Visual Question Answering | VizWiz | Accuracy | 53.12 | 1820 |
| Image Classification | SVHN (test) | Accuracy | 96.5 | 470 |
| Visual Question Answering | ScienceQA | Accuracy | 72.32 | 446 |
| Image Classification | DTD (test) | Accuracy | 76.2 | 316 |
| Instruction Following | MT-Bench | MT-Bench Score | 8.7 | 287 |
| Visual Question Answering | OK-VQA | Accuracy | 31.29 | 272 |
| Image Classification | SUN397 (test) | Top-1 Accuracy | 77.3 | 231 |
| Image Classification | EuroSAT (test) | Accuracy | 99.5 | 177 |
| Visual Question Answering | GQA | Accuracy | 62.08 | 155 |
| Coding | MBPP | Accuracy | 83.3 | 145 |