LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint

About

Fine-tuning pre-trained Large Language Models (LLMs) for specialized tasks incurs substantial computational and data costs. While model merging offers a training-free solution to integrate multiple task-specific models, existing methods suffer from safety-utility conflicts where enhanced general capabilities degrade safety safeguards. We identify two root causes: neuron misidentification due to simplistic parameter magnitude-based selection, and cross-task neuron interference during merging. To address these challenges, we propose LED-Merging, a three-stage framework that Locates task-specific neurons via gradient-based attribution, dynamically Elects critical neurons through multi-model importance fusion, and Disjoints conflicting updates through parameter isolation. Extensive experiments on Llama-3-8B, Mistral-7B, and Llama2-13B demonstrate that LED-Merging effectively reduces harmful response rates, showing a 31.4% decrease on Llama-3-8B-Instruct on HarmBench, while simultaneously preserving 95% of utility performance, such as achieving 52.39% accuracy on GSM8K. LED-Merging resolves safety-utility conflicts and provides a lightweight, training-free paradigm for constructing reliable multi-task LLMs. Code is available at https://github.com/MqLeet/LED-Merging.
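
The three stages named in the abstract can be illustrated with a toy sketch on flat parameter lists. This is not the authors' implementation (the linked repository is the reference for that). The scoring rule |w * g|, the sum-based fusion of base and fine-tuned scores, the rule that gives a contested position to the higher-scoring task, and every function and parameter name below are assumptions chosen only for illustration.

```python
def top_fraction_mask(scores, fraction):
    """Return a 0/1 mask marking the highest-scoring `fraction` of positions."""
    k = max(1, round(len(scores) * fraction))
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    keep = set(order[:k])
    return [1 if i in keep else 0 for i in range(len(scores))]


def locate(weights, grads):
    """Locate: per-parameter attribution score |w * g| (assumed scoring rule)."""
    return [abs(w * g) for w, g in zip(weights, grads)]


def elect(base_scores, tuned_scores, fraction):
    """Elect: fuse base and fine-tuned scores by summing, keep the top fraction."""
    fused = [b + t for b, t in zip(base_scores, tuned_scores)]
    return fused, top_fraction_mask(fused, fraction)


def disjoint(fused_by_task, masks_by_task):
    """Disjoint: a position claimed by several tasks goes to the highest fused score."""
    n = len(masks_by_task[0])
    out = [[0] * n for _ in masks_by_task]
    for j in range(n):
        claimants = [t for t, m in enumerate(masks_by_task) if m[j]]
        if claimants:
            winner = max(claimants, key=lambda t: fused_by_task[t][j])
            out[winner][j] = 1
    return out


def led_style_merge(base, tuned_models, base_grads, tuned_grads,
                    fraction=0.5, scale=1.0):
    """Add each task's masked delta (tuned - base) to the base parameters."""
    fused_all, masks = [], []
    for tuned, g_base, g_tuned in zip(tuned_models, base_grads, tuned_grads):
        fused, mask = elect(locate(base, g_base), locate(tuned, g_tuned), fraction)
        fused_all.append(fused)
        masks.append(mask)
    masks = disjoint(fused_all, masks)
    merged = list(base)
    for tuned, mask in zip(tuned_models, masks):
        for j, keep in enumerate(mask):
            if keep:
                merged[j] += scale * (tuned[j] - base[j])
    return merged, masks


# Toy data: four parameters, a "safety" model and a "math" model (made-up values).
base = [1.0, 1.0, 1.0, 1.0]
safety = [1.5, 1.0, 0.5, 1.0]
math_model = [1.0, 2.0, 1.5, 1.0]
# Hypothetical gradients of each task's loss on the base and fine-tuned models.
base_grads = [[0.9, 0.1, 0.8, 0.0], [0.1, 0.9, 0.3, 0.0]]
tuned_grads = [[0.8, 0.1, 0.6, 0.0], [0.1, 0.7, 0.4, 0.0]]

merged, masks = led_style_merge(base, [safety, math_model], base_grads, tuned_grads)
print(masks)   # [[1, 0, 1, 0], [0, 1, 0, 0]]
print(merged)  # [1.5, 2.0, 0.5, 1.0]
```

In this toy run both tasks elect position 2, so the disjoint step assigns it to the safety model alone and the math model's update there is dropped. This is one simple way to keep the two sets of updates from overlapping.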

Qianli Ma, Dongrui Liu, Qian Chen, Linfeng Zhang, Jing Shao • 2025

Related benchmarks

Task | Dataset | Result | Rank
Mathematical Reasoning | GSM8K | Accuracy 52.39 | 983
Mathematical Reasoning | MATH | Accuracy 16.12 | 643
Mathematical Reasoning | AIME | AIME Accuracy 26.67 | 283
Science Question Answering | ARC Challenge | Accuracy 34.58 | 234
Code Generation | HumanEval | Pass@1 45.12 | 108
Science Question Answering | ARC Easy | Accuracy 35.1 | 101
Safety Alignment | HarmBench | ASR 4 | 88
Code Generating | MBPP | Pass@1 47.2 | 88
Code Generation | LiveCodeBench | Pass@1 19.27 | 86
Reasoning | GSM8K | -- | 83
Showing 10 of 46 rows
