
SVD-LLM V2: Optimizing Singular Value Truncation for Large Language Model Compression

About

Despite significant advancements, the practical deployment of Large Language Models (LLMs) is often hampered by their immense size, highlighting the need for effective compression techniques. Singular Value Decomposition (SVD) is a promising LLM compression technique. However, existing SVD-based compression methods fall short in reducing truncation losses, leading to less competitive performance in compressed models. In this work, we introduce SVD-LLM V2, an SVD-based LLM compression method that optimizes singular value truncation in SVD compression with two techniques. First, SVD-LLM V2 uses the theoretical truncation loss of weight matrices to assign a unique compression ratio to each weight matrix at different layers, accommodating weight redundancy heterogeneity. Second, SVD-LLM V2 applies loss-optimized weight truncation to ensure that the truncated singular values result in a lower and more stable truncation loss in practice. We evaluate SVD-LLM V2 on ten datasets and five LLMs at various scales. Our results show that SVD-LLM V2 outperforms state-of-the-art SVD-based LLM compression methods. Our code is available at https://github.com/AIoT-MLSys-Lab/SVD-LLM

Xin Wang, Samiul Alam, Zhongwei Wan, Hui Shen, Mi Zhang • 2025
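
For readers new to the terms in the abstract, the following is a minimal sketch of plain truncated-SVD compression of a single weight matrix. It is not the SVD-LLM V2 algorithm: the function names `truncate_svd` and `rank_for_ratio`, the random matrix, and the choice of the unweighted Frobenius reconstruction error as the "truncation loss" are all assumptions made for illustration, and the paper's own loss definition may differ. It only shows how a compression ratio maps to a rank and how the loss follows from the discarded singular values.

```python
import numpy as np


def truncate_svd(W, rank):
    """Return factors (A, B) with A @ B approximating W, plus the truncation loss.

    Illustrative only: the loss here is the Frobenius norm of W - A @ B,
    which equals the root of the sum of squared discarded singular values.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    A = U[:, :rank] * S[:rank]  # shape (m, rank), singular values folded in
    B = Vt[:rank, :]            # shape (rank, n)
    loss = float(np.sqrt(np.sum(S[rank:] ** 2)))
    return A, B, loss


def rank_for_ratio(m, n, ratio):
    """Largest rank whose two factors store at most (1 - ratio) * m * n values.

    Hypothetical helper: a rank-r factorization stores r * (m + n) values.
    """
    return max(1, int((1.0 - ratio) * m * n / (m + n)))


# Stand-in for one weight matrix (random data, not a real model weight).
rng = np.random.default_rng(0)
W = rng.standard_normal((64, 48))

r = rank_for_ratio(64, 48, ratio=0.5)
A, B, loss = truncate_svd(W, r)

# The loss predicted from the singular values matches the measured error.
measured = float(np.linalg.norm(W - A @ B))
print(r, A.shape, B.shape, np.isclose(loss, measured))
```

Applying the same ratio to matrices with different singular value spectra gives different losses, which is the kind of heterogeneity the abstract refers to when it assigns each weight matrix its own compression ratio.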

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Language Modeling | WikiText-2 (test) | PPL 4.71 | 2333 |
| Language Modeling | WikiText-2 | Perplexity (PPL) 7.09 | 2320 |
| Language Modeling | C4 | Perplexity 10.47 | 1688 |
| Language Modeling | C4 | Perplexity 9.98 | 1565 |
| Automatic Speech Recognition | LibriSpeech clean (test) | WER 3.84 | 1207 |
| Automatic Speech Recognition | LibriSpeech (test-other) | WER 6.8 | 1206 |
| Language Modeling | WikiText | PPL 21 | 740 |
| Physical Commonsense Reasoning | PIQA | Accuracy 57 | 696 |
| Question Answering | ARC-E | Accuracy 33.6 | 523 |
| Optical Character Recognition | OCRBench | Score 0.351 | 433 |

Showing 10 of 66 rows
