CAT Merging: A Training-Free Approach for Resolving Conflicts in Model Merging
About
Multi-task model merging offers a promising paradigm for integrating multiple expert models into a unified model without additional training. Existing state-of-the-art techniques, such as Task Arithmetic and its variants, merge models by accumulating task vectors -- the parameter differences between pretrained and finetuned models. However, task vector accumulation is often hindered by knowledge conflicts, leading to performance degradation. To address this challenge, we propose Conflict-Aware Task Merging (CAT Merging), a novel training-free framework that selectively trims conflict-prone components from the task vectors. CAT Merging introduces several parameter-specific strategies, including projection for linear weights and masking for scaling and shifting parameters in normalization layers. Extensive experiments on vision, language, and vision-language tasks demonstrate that CAT Merging effectively suppresses knowledge conflicts, achieving average accuracy improvements of up to 2.5% (ViT-B/32) and 2.0% (ViT-L/14) over state-of-the-art methods.
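As a rough illustration of the ideas in the abstract, the sketch below shows plain Task Arithmetic (accumulating task vectors onto the pretrained weights) and a generic way to trim one direction from a task vector by projection. It is not the CAT Merging algorithm itself: the abstract does not specify how conflict-prone components are identified, so the function names, the flat-list parameter representation, the `scale` value, and the `direction` argument are all assumptions made for illustration.

```python
def task_vector(pretrained, finetuned):
    """Task vector: element-wise difference finetuned - pretrained."""
    return [f - p for p, f in zip(pretrained, finetuned)]


def task_arithmetic_merge(pretrained, finetuned_models, scale=0.3):
    """Add the scaled sum of all task vectors to the pretrained weights.

    `scale` is a hypothetical merging coefficient, not a value from the paper.
    """
    merged = list(pretrained)
    for finetuned in finetuned_models:
        tau = task_vector(pretrained, finetuned)
        merged = [m + scale * t for m, t in zip(merged, tau)]
    return merged


def remove_direction(tau, direction):
    """Project out the component of `tau` that lies along `direction`.

    Here `direction` stands in for a conflict-prone direction; how such a
    direction would be chosen is outside the scope of this sketch.
    """
    norm_sq = sum(d * d for d in direction)
    coeff = sum(t * d for t, d in zip(tau, direction)) / norm_sq
    return [t - coeff * d for t, d in zip(tau, direction)]
```

In this toy setting, a trimmed task vector returned by `remove_direction` has zero inner product with `direction`, so accumulating it no longer shifts the merged weights along that direction.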
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | SVHN (test) | Accuracy | 85.5 | 470 |
| Image Classification | DTD (test) | Accuracy | 60.7 | 316 |
| Image Classification | SUN397 (test) | Top-1 Accuracy | 68.1 | 231 |
| Image Classification | EuroSAT (test) | Accuracy | 89.5 | 177 |
| Image Classification | MNIST (test) | Accuracy | 98.6 | 138 |
| Image Classification | GTSRB (test) | Accuracy (Clean) | 78.5 | 94 |
| Image Classification | Cars (test) | Accuracy | 65.4 | 73 |
| Image Classification | Resisc45 (test) | -- | -- | 62 |
| Image Classification | 8 vision tasks Average | Average Accuracy | 78.3 | 53 |