
ResMerge: Residual-based Spectral Merging of Large Language Models

About

Model merging offers a training-free way to combine multiple post-trained expert models, but merging experts obtained through reinforcement learning (RL) remains challenging. Existing spectral merging methods often assume that leading singular directions contain the main task signal, while lower-energy residual components can be compressed, selected, or attenuated to reduce interference. We find that this assumption does not hold for RL task vectors: after decomposing each task vector into a leading spectral head and a residual component, both parts can independently recover substantial behavior knowledge, while exhibiting different merging properties. The head is highly concentrated and informative but more prone to sharp cross-expert conflicts, whereas the residual component is more dispersed and provides a more stable basis for aggregation. Based on this observation, we propose ResMerge, a residual-based spectral merging framework for RL experts. ResMerge first constructs a stable residual backbone with Spherical Residual Consensus Adaptation, which estimates a reliability-weighted consensus direction on the Frobenius sphere. It then reintroduces leading-head information through a Lightweight Head Correction module gated by positive cross-expert agreement. Experiments across multiple RL expert groups and capability domains show that ResMerge better preserves expert capabilities than representative task-vector and spectral merging baselines. The implementation of ResMerge is publicly available at https://github.com/sunyd0303-cpu/ResMerge-release.

Yandu Sun, Zhiyan Hou, Haokai Ma, Yuheng Jia, Junfeng Fang, Haiyun Guo, Hongyan An, Weizhen Wang, Jinqiao Wang • 2026
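
The head/residual split described in the abstract can be illustrated with a truncated SVD. The following is a minimal sketch, not the authors' implementation: the function name `split_head_residual`, the head rank `k`, and the toy random matrices are all assumptions made for illustration, and the abstract does not state how the head size is chosen.

```python
import numpy as np


def split_head_residual(task_vector, k):
    """Split a 2-D task vector into a rank-k spectral head and its residual.

    Hypothetical helper: `k` (the number of leading singular directions
    assigned to the head) is an illustrative assumption.
    """
    U, S, Vt = np.linalg.svd(task_vector, full_matrices=False)
    # Head: reconstruction from the k largest singular values.
    head = (U[:, :k] * S[:k]) @ Vt[:k, :]
    # Residual: everything the head does not capture.
    residual = task_vector - head
    return head, residual


# Toy stand-ins for one weight matrix of a base model and of an expert.
rng = np.random.default_rng(0)
base = rng.standard_normal((8, 6))
expert = base + 0.1 * rng.standard_normal((8, 6))

# Task vector: expert weights minus base weights.
tau = expert - base
head, residual = split_head_residual(tau, k=2)

# Fraction of squared Frobenius norm carried by the head.
head_energy = np.linalg.norm(head) ** 2 / np.linalg.norm(tau) ** 2
```

By construction the two parts sum back to the task vector and are orthogonal under the Frobenius inner product, so `head_energy` measures how concentrated the task vector is in its leading directions.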

Related benchmarks

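| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Coding | HumanEval+ | Pass@1 | 71.34 | 164 |
| Mathematics | AIME25 | Accuracy | 20 | 103 |
| Math | AIME24 | Accuracy | 23.33 | 57 |
| Mathematics | AIME 2024 | Accuracy | 16.67 | 40 |
| Coding | LiveCodeBench | Accuracy | 13.5 | 38 |
| Math | AMC23 | Score | 55 | 33 |
| Math | MATH 500 | Accuracy | 78 | 25 |
| Memory | RULER HotpotQA | Score | 65 | 24 |
| Tool-using | Live Para | Accuracy | 75 | 22 |
| General Performance | Aggregated Benchmarks | Overall Average | 47.25 | 22 |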
Showing 10 of 31 rows
