
Enhancing Delta Compression in LLMs via SVD-based Quantization Error Minimization

About

Supervised Fine-Tuning (SFT) empowers Large Language Models (LLMs) with exceptional performance on specialized tasks, but it yields dense, high-dimensional delta parameters that pose severe storage and distribution challenges. Singular Value Decomposition (SVD)-based compression offers a compact representation for such delta parameters, but existing methods adopt heuristic quantization without clarifying the underlying mechanisms, leading to poor generalizability. In this work, we propose PrinMix, a rigorous SVD-based framework that models quantization as an optimization problem, grounding the design in mathematical mechanisms. We first theoretically derive the quantization error and identify a key singular-value-dominated scaling mechanism, which mathematically proves the necessity of mixed-precision quantization. We then model the quantization scheme as a 0/1 Integer Linear Programming (ILP) problem, which yields optimal bit-budget-constrained solutions without empirical assumptions. Furthermore, PrinMix integrates a Reconstruction Target Correction (RTC) method to compensate for errors from the $\mathbf{V}$-then-$\mathbf{U}$ sequential quantization process. In extensive experiments on 7B LLMs, PrinMix outperforms the state-of-the-art Delta-CoMe on challenging benchmarks by 22.3% on AIME2024 and 6.1% on GQA.
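To make the idea of SVD-based mixed-precision delta compression concrete, here is a minimal sketch in Python with NumPy. It is not the PrinMix implementation: the bit allocation is hand-picked rather than solved as a 0/1 ILP, RTC is omitted, and the quantizer is a simple simulated symmetric uniform one. The names `quantize_uniform`, `compress_delta`, `bit_plan`, `average_bits`, and `relative_error` are hypothetical, and a random matrix with a decaying spectrum stands in for a real delta weight.

```python
import numpy as np


def quantize_uniform(x, bits):
    """Simulated symmetric uniform quantization (quantize, then dequantize)."""
    if bits >= 16:
        return x.copy()  # treat 16 bits and above as lossless in this sketch
    if bits < 2:
        raise ValueError("bits must be >= 2")
    levels = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits, 1 for 2 bits
    max_abs = float(np.max(np.abs(x)))
    if max_abs == 0.0:
        return np.zeros_like(x)
    scale = max_abs / levels
    return np.round(x / scale) * scale


def compress_delta(delta, bit_plan):
    """Reconstruct delta from quantized singular vectors.

    bit_plan is a list of (num_components, bits) pairs applied in order of
    decreasing singular value. Components not covered by the plan are dropped.
    """
    U, S, Vt = np.linalg.svd(delta, full_matrices=False)
    U_q = np.zeros_like(U)
    Vt_q = np.zeros_like(Vt)
    start = 0
    for count, bits in bit_plan:
        end = min(start + count, S.size)
        if end <= start:
            break
        U_q[:, start:end] = quantize_uniform(U[:, start:end], bits)
        Vt_q[start:end, :] = quantize_uniform(Vt[start:end, :], bits)
        start = end
    # Each rank-one term is weighted by its singular value, so quantization
    # error in the leading components is amplified the most.
    return (U_q * S) @ Vt_q


def average_bits(bit_plan):
    """Average nominal bits per retained singular component."""
    total = sum(count for count, _ in bit_plan)
    return sum(count * bits for count, bits in bit_plan) / total


def relative_error(delta, approx):
    return np.linalg.norm(delta - approx) / np.linalg.norm(delta)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    U0, _, Vt0 = np.linalg.svd(rng.standard_normal((64, 64)))
    decay = 1.0 / (1.0 + np.arange(64))  # assumed fast-decaying spectrum
    delta = (U0 * decay) @ Vt0

    mixed = [(8, 8), (8, 4), (48, 2)]  # more bits for leading components
    uniform = [(64, 3)]                # same average budget, one precision
    for name, plan in [("mixed", mixed), ("uniform", uniform)]:
        err = relative_error(delta, compress_delta(delta, plan))
        print(f"{name}: avg bits = {average_bits(plan):.2f}, rel. error = {err:.4f}")
```

Both plans in the demo use the same average budget of 3 nominal bits per component, so the printed errors compare a mixed allocation against a single precision at equal cost. Choosing the plan optimally under a bit budget is the step the abstract formulates as a 0/1 ILP.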

Boya Xiong, Shuo Wang, Weifeng Ge, Guanhua Chen, Yun Chen • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Code Generation | HumanEval | Pass@1 | 85.6 | 1036 |
| Visual Question Answering | GQA | Accuracy | 62.7 | 505 |
| Mathematical Reasoning | AIME 2024 | Accuracy | 36.7 | 370 |
| Science Question Answering | ScienceQA (SQA) | Accuracy | 79.4 | 273 |
| Mathematical Reasoning | GSM8K | Math Score | 56.1 | 197 |
| Code Generation | MBPP | Pass@1 | 83.1 | 193 |
| Code Generation | MBPP | Accuracy (%) | 86.9 | 146 |
| Mathematical Reasoning | MATH500 | Accuracy (ACC) | 80.2 | 133 |
| Visual Question Answering | SQA | Accuracy | 72.1 | 41 |
| Code Generation | MBPP | MBPP Score | 50.2 | 35 |
Showing 10 of 13 rows
