ACE-Merging: Data-Free Model Merging with Adaptive Covariance Estimation

About

Model merging aims to combine multiple task-specific expert models into a single model while preserving generalization across diverse tasks. However, interference among experts, especially when they are trained on different objectives, often leads to significant performance degradation. Despite recent progress, resolving this interference without data access, retraining, or architectural modification remains a fundamental challenge. This paper provides a theoretical analysis demonstrating that the input covariance of each task, which is a key factor for optimal merging, can be implicitly estimated from the parameter differences of its fine-tuned model, even in a fully data-free setting. Building on this insight, we introduce ACE-Merging, an Adaptive Covariance Estimation framework that effectively mitigates inter-task interference. Our approach features a principled, closed-form solution, in contrast to prior iterative or heuristic methods. Extensive experiments on both vision and language benchmarks demonstrate that ACE-Merging sets a new state of the art among data-free methods. It consistently outperforms existing baselines; for example, ACE-Merging achieves an average absolute improvement of 4% over previous methods across seven tasks on GPT-2. Owing to its efficient closed-form formulation, ACE-Merging delivers superior performance at modest computational cost, providing a practical and theoretically grounded solution for model merging.
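The abstract does not give the formulas, so the following is only an illustrative sketch of the general idea it describes (a closed-form, data-free merge of one linear layer weighted by covariance estimates), not the authors' algorithm. It assumes weights of shape `(d_in, d_out)`, assumes the proxy `delta_w @ delta_w.T` for each task's input covariance, and assumes a regression-style closed form `(sum C_t)^-1 (sum C_t W_t)`. The names `estimate_covariance`, `merge_linear`, and `eps` are hypothetical.

```python
import numpy as np


def estimate_covariance(delta_w, eps=1e-6):
    """Assumed proxy for a task's input covariance, built only from its
    parameter difference delta_w (shape: d_in x d_out). Not the paper's estimator."""
    d_in = delta_w.shape[0]
    cov = delta_w @ delta_w.T                 # d_in x d_in, positive semi-definite
    cov = cov / (np.trace(cov) + eps)         # assumption: trace-normalise each task
    return cov + eps * np.eye(d_in)           # small ridge keeps the matrix invertible


def merge_linear(w_base, experts, eps=1e-6):
    """Closed-form merge of one linear layer under the assumptions above:
    W = (sum_t C_t)^-1 (sum_t C_t W_t), with C_t estimated without any data."""
    covs = [estimate_covariance(w - w_base, eps) for w in experts]
    lhs = sum(covs)                                    # d_in x d_in
    rhs = sum(c @ w for c, w in zip(covs, experts))    # d_in x d_out
    return np.linalg.solve(lhs, rhs)                   # solve instead of explicit inverse


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=(4, 3))
    expert_a = base + 0.1 * rng.normal(size=(4, 3))
    expert_b = base + 0.1 * rng.normal(size=(4, 3))
    print(merge_linear(base, [expert_a, expert_b]).shape)
```

Under these assumptions the merge needs only one linear solve per layer and no forward passes, which is the kind of cost profile a closed-form, data-free method would have. When all experts are identical, the sketch returns that expert unchanged.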

Bo Xu, Haotian Wu, Hehai Lin, Weiquan Huang, Beier Zhu, Yao Shu, Chengwei Qin • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Code Generation | HumanEval | - | - | 1036 |
| Mathematical Reasoning | MathQA | Accuracy | 34.57 | 305 |
| Image Classification | 14 Vision Tasks | Average Accuracy | 91.1 | 84 |
| Multi-task image classification | 8-task vision benchmark | Average Accuracy | 94.6 | 24 |
| Multi-task image classification | Vision Benchmark 20-task | Average Accuracy | 89.5 | 24 |
| Natural Language Understanding | GLUE | CoLA Score | 70.3 | 8 |
| Mathematical Reasoning | gsm8k cot | Accuracy | 94.26 | 7 |
| Question Answering | xquad zh | Raw Score | 16.42 | 4 |
| Question Answering | xquad vi | Score | 37.68 | 4 |
| Code Generation | MathQA | Normalized Performance | 100.8 | 3 |
Showing 10 of 13 rows
