Basis Sharing: Cross-Layer Parameter Sharing for Large Language Model Compression

About

Large Language Models (LLMs) have achieved remarkable breakthroughs. However, the huge number of parameters in LLMs requires a significant amount of memory during inference, which prevents their practical deployment in many applications. To reduce the memory footprint of LLMs, singular value decomposition (SVD) provides a promising way to approximate weight matrices and thereby compress the model. In this paper, we take a step further and explore parameter sharing across different layers with SVD to achieve more effective compression of LLMs. Specifically, weight matrices in different layers are decomposed and represented as linear combinations of a set of shared basis vectors with layer-specific coefficients. We examine which types of weight matrices and which layers should share a basis so that performance is maintained after compression. Comprehensive experiments demonstrate that Basis Sharing outperforms state-of-the-art SVD-based compression approaches and parameter sharing techniques, especially under large compression ratios. Code is available at: https://github.com/TUDa-HWAI/Basis_Sharing
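
The shared-basis idea in the abstract can be illustrated with a small NumPy sketch. This is not the authors' implementation (see the repository linked above for that); it assumes the simplest possible variant, a plain truncated SVD over same-shaped matrices concatenated side by side, and the names `share_basis` and `reconstruct` are hypothetical helpers introduced only for illustration.

```python
import numpy as np

def share_basis(weights, rank):
    """Factor same-shaped weight matrices into one shared basis plus
    per-layer coefficients using a truncated SVD (illustrative assumption,
    not necessarily the paper's exact procedure)."""
    d_in = weights[0].shape[1]
    # Place the layers' matrices side by side: shape (d_out, n_layers * d_in).
    stacked = np.concatenate(weights, axis=1)
    u, s, vt = np.linalg.svd(stacked, full_matrices=False)
    # Shared basis: top-`rank` left singular vectors scaled by singular values.
    basis = u[:, :rank] * s[:rank]
    # Layer-specific coefficients: the matching column block of V^T.
    coeffs = [vt[:rank, i * d_in:(i + 1) * d_in] for i in range(len(weights))]
    return basis, coeffs

def reconstruct(basis, coeff):
    """Approximate one layer's weight matrix as basis @ coefficients."""
    return basis @ coeff

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Toy "layers" whose columns lie in a common 8-dimensional subspace.
    common = rng.standard_normal((64, 8))
    layers = [common @ rng.standard_normal((8, 32)) for _ in range(4)]

    basis, coeffs = share_basis(layers, rank=8)
    original = sum(w.size for w in layers)            # 4 * 64 * 32 = 8192
    shared = basis.size + sum(c.size for c in coeffs)  # 512 + 4 * 256 = 1536
    print("parameters before:", original, "after:", shared)
    print("exact recovery:",
          all(np.allclose(reconstruct(basis, c), w) for w, c in zip(layers, coeffs)))
```

In this toy setup the layers share an exact low-rank column space, so the reconstruction is lossless; real LLM weight matrices would only be approximated, and the choice of rank controls the trade-off between compression ratio and accuracy.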

Jingcun Wang, Yu-Guang Chen, Ing-Chao Lin, Bing Li, Grace Li Zhang • 2024

Related benchmarks

Task                   Dataset         Result                   Rank
Language Modeling      WikiText-2      Perplexity (PPL) 7.74    2320
Commonsense Reasoning  HellaSwag       Accuracy 43.4            1896
Language Modeling      C4              Perplexity 15.03         1688
Language Modeling      C4              Perplexity 23.3          1565
Question Answering     ARC Challenge   Accuracy 33.7            906
Language Modeling      WikiText        PPL 15.2                 740
Question Answering     ARC Easy        Accuracy 66.5            597
Question Answering     PIQA            Accuracy 70.1            505
Question Answering     SciQ            Accuracy 91              283
Language Modeling      LAMBADA         Perplexity 7.2           198
Showing 10 of 21 rows
