
Enhancing Delta Compression in LLMs via SVD-based Quantization Error Minimization

About

Supervised Fine-Tuning (SFT) empowers Large Language Models (LLMs) with exceptional performance on specialized tasks, but it yields dense, high-dimensional delta parameters that pose severe storage and distribution challenges. Singular Value Decomposition (SVD)-based compression offers a compact representation for such delta parameters, but existing methods adopt heuristic quantization without clarifying the underlying mechanisms, leading to poor generalizability. In this work, we propose PrinMix, a rigorous SVD-based framework that models quantization as an optimization problem, grounding the design in mathematical mechanisms. We first theoretically derive the quantization error and identify a key singular-value-dominated scaling mechanism, which mathematically proves the necessity of mixed-precision quantization. We then model the quantization scheme as a 0/1 Integer Linear Programming (ILP) problem, which yields optimal bit-budget-constrained solutions without empirical assumptions. Furthermore, PrinMix integrates a Reconstruction Target Correction (RTC) method to compensate for errors from the $\mathbf{V}$-then-$\mathbf{U}$ sequential quantization process. Extensive experiments confirm that PrinMix performs well: for 7B LLMs, PrinMix outperforms the SOTA Delta-CoMe on challenging benchmarks by 22.3% on AIME2024 and 6.1% on GQA.
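To make the idea of SVD-based, mixed-precision delta compression concrete, here is a minimal NumPy sketch. It is not the PrinMix algorithm: the bit allocation is hand-picked rather than solved as a 0/1 ILP, RTC is not included, and the quantizer is a plain min-max uniform quantizer. All names (`quantize_uniform`, `compress_delta`, `rel_error`) and the synthetic delta matrix are assumptions made for illustration only.

```python
import numpy as np


def quantize_uniform(x, bits):
    """Min-max uniform quantization of a vector to `bits` bits, then dequantize."""
    if bits == 0:
        return np.zeros_like(x)  # 0 bits means this component is dropped
    lo, hi = x.min(), x.max()
    if hi == lo:
        return x.copy()  # constant vector: nothing to quantize
    step = (hi - lo) / (2 ** bits - 1)
    return np.round((x - lo) / step) * step + lo


def compress_delta(delta, bits_per_rank):
    """Approximate `delta` from its SVD, quantizing the i-th pair of singular
    vectors with bits_per_rank[i] bits (ranks beyond the list are dropped)."""
    U, S, Vt = np.linalg.svd(delta, full_matrices=False)
    approx = np.zeros_like(delta)
    for i, bits in enumerate(bits_per_rank):
        if bits == 0:
            continue
        u_q = quantize_uniform(U[:, i], bits)
        v_q = quantize_uniform(Vt[i, :], bits)
        approx += S[i] * np.outer(u_q, v_q)
    return approx


def rel_error(reference, approx):
    """Relative Frobenius-norm reconstruction error."""
    return np.linalg.norm(reference - approx) / np.linalg.norm(reference)


# Synthetic stand-in for a fine-tuning delta (W_finetuned - W_base):
# an approximately low-rank matrix plus small dense noise.
rng = np.random.default_rng(0)
delta = 0.01 * rng.standard_normal((64, 8)) @ rng.standard_normal((8, 64))
delta += 1e-4 * rng.standard_normal((64, 64))

# Two hand-picked allocations with the same total of 96 "bit units":
# more bits for the leading singular vectors vs. a flat allocation.
mixed = [8] * 4 + [4] * 8 + [2] * 16
flat = [3] * 32
assert sum(mixed) == sum(flat)

print("mixed-precision error:", rel_error(delta, compress_delta(delta, mixed)))
print("flat-precision error: ", rel_error(delta, compress_delta(delta, flat)))
```

The sketch only shows where per-rank bit widths enter the reconstruction. In the framework described above, those widths would instead come from solving the bit-budget-constrained optimization problem.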

Boya Xiong, Shuo Wang, Weifeng Ge, Guanhua Chen, Yun Chen • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Code Generation | HumanEval | Pass@1 | 85.6 | 850 |
| Visual Question Answering | GQA | Accuracy | 62.7 | 374 |
| Mathematical Reasoning | AIME 2024 | Accuracy | 36.7 | 251 |
| Code Generation | MBPP | Pass@1 | 83.1 | 175 |
| Mathematical Reasoning | GSM8K | Math Score | 56.1 | 171 |
| Code Generation | MBPP | Accuracy (%) | 86.9 | 146 |
| Mathematical Reasoning | MATH500 | Accuracy (ACC) | 80.2 | 133 |
| Science Question Answering | ScienceQA (SQA) | Accuracy | 79.4 | 128 |
| Code Generation | MBPP | MBPP Score | 50.2 | 35 |
| Visual Question Answering | SQA | Accuracy | 72.1 | 23 |
Showing 10 of 13 rows
