SVD-LLM: Truncation-aware Singular Value Decomposition for Large Language Model Compression

About

Advances in Large Language Models (LLMs) have been hindered by their substantial size, which necessitates LLM compression methods for practical deployment. Singular Value Decomposition (SVD) offers a promising solution for LLM compression. However, state-of-the-art SVD-based LLM compression methods have two key limitations: truncating smaller singular values may lead to higher compression loss, and the compressed weights are not updated after SVD truncation. In this work, we propose SVD-LLM, an SVD-based post-training LLM compression method that addresses the limitations of existing methods. SVD-LLM incorporates a truncation-aware data whitening technique to ensure a direct mapping between singular values and compression loss. Moreover, SVD-LLM adopts a parameter update with sequential low-rank approximation to compensate for the accuracy degradation after SVD compression. We evaluate SVD-LLM on 10 datasets and seven models from three different LLM families at three different scales. Our results demonstrate the superiority of SVD-LLM over state-of-the-art methods, especially at high model compression ratios. Our code is available at https://github.com/AIoT-MLSys-Lab/SVD-LLM
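
The "direct mapping between singular values and compression loss" can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked above for that). It assumes the whitening matrix is the Cholesky factor of the activation Gram matrix X X^T, and the function name `whitened_truncated_svd`, the matrix shapes, and the random data are all hypothetical. Under that assumption, the Frobenius-norm output error after truncation equals the root of the sum of the squared discarded singular values.

```python
import numpy as np

def whitened_truncated_svd(W, X, rank):
    """Low-rank factorization of W using whitened activations (illustrative sketch)."""
    # Assumption: whitening matrix S is the Cholesky factor of X X^T,
    # so that S @ S.T equals the Gram matrix of the activations.
    gram = X @ X.T
    S = np.linalg.cholesky(gram)

    # SVD of the whitened weight matrix W S.
    U, sigma, Vt = np.linalg.svd(W @ S, full_matrices=False)

    # Keep the top `rank` components and undo the whitening on the right factor.
    A = U[:, :rank] * sigma[:rank]           # shape (out_dim, rank)
    B = Vt[:rank, :] @ np.linalg.inv(S)      # shape (rank, in_dim)
    return A, B, sigma

# Hypothetical toy data: an 8x6 weight matrix and 64 activation columns.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))
X = rng.standard_normal((6, 64))

rank = 3
A, B, sigma = whitened_truncated_svd(W, X, rank)

# Actual output error versus the error predicted from the discarded singular values.
loss = np.linalg.norm(W @ X - A @ B @ X)
predicted = np.sqrt(np.sum(sigma[rank:] ** 2))
print(loss, predicted)
```

In this sketch the two printed numbers agree up to floating-point error, which is the sense in which each truncated singular value accounts for a known share of the loss. The parameter update with sequential low-rank approximation mentioned in the abstract is not covered here.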

Xin Wang, Yu Zheng, Zhongwei Wan, Mi Zhang • 2024

Related benchmarks

Task	Dataset	Result
Language Modeling	WikiText2	Perplexity 6.65	3785
Language Modeling	WikiText-2 (test)	PPL 6.34	2333
Language Modeling	WikiText-2	Perplexity (PPL) 7.94	2320
Commonsense Reasoning	HellaSwag	Accuracy 60.4	1896
Language Modeling	C4	Perplexity 11.16	1688
Language Modeling	C4	Perplexity 10.8	1565
Commonsense Reasoning	WinoGrande	--	1442
Mathematical Reasoning	GSM8K	Accuracy 64	1398
Language Modeling	PTB	Perplexity 16.22	1234
Code Generation	HumanEval	Pass@1 55	1043

Showing 10 of 136 rows
