LaCo: Large Language Model Pruning via Layer Collapse

About

Large language models (LLMs) based on the transformer architecture are growing rapidly in size, which brings considerable costs to both training and inference. Existing remedies such as model quantization, knowledge distillation, and model pruning are constrained by various issues, including limited hardware support, the need for extensive training, and alterations to the model's internal structure. In this paper, we propose a concise layer-wise structured pruner called Layer Collapse (LaCo), in which rear model layers collapse into a prior layer, enabling a rapid reduction in model size while preserving the model structure. Comprehensive experiments show that our method maintains an average task performance of over 80% at pruning ratios of 25-30%, significantly outperforming existing state-of-the-art structured pruning methods. We also conduct post-training experiments to confirm that LaCo effectively inherits the parameters of the original model. Additionally, we perform ablation studies on various settings of LaCo. Finally, we discuss our motivation from the perspective of layer-wise similarity and evaluate the performance of the pruned LLMs across various pruning ratios. Code: https://github.com/yangyifei729/LaCo
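To make the mechanism in the abstract concrete, below is a minimal PyTorch-style sketch of what collapsing rear layers into a prior layer could look like: the parameter differences between each rear layer and a base layer are summed onto a copy of that base layer, which then replaces the whole group. The function name, its arguments, and the choice of which layers to merge are illustrative assumptions, not the authors' reference implementation; consult the linked repository for the actual code.

```python
import copy

import torch
import torch.nn as nn


@torch.no_grad()
def collapse_layers(layers: nn.ModuleList, start: int, num_merged: int) -> nn.ModuleList:
    """Sketch of a layer-collapse step (hypothetical helper, not the paper's code).

    Merges the `num_merged` layers following layers[start] into it and drops
    them, so each merged parameter becomes roughly
        theta_base + sum_k (theta_{start+k} - theta_base).
    All layers are assumed to share an identical structure.
    """
    base = layers[start]
    merged = copy.deepcopy(base)
    merged_params = dict(merged.named_parameters())
    base_params = dict(base.named_parameters())
    for k in range(1, num_merged + 1):
        for name, param in layers[start + k].named_parameters():
            # Accumulate the rear layer's difference from the base layer.
            merged_params[name].add_(param - base_params[name])
    # Keep everything outside the collapsed group; substitute the merged layer.
    kept = [merged if i == start else layer
            for i, layer in enumerate(layers)
            if i <= start or i > start + num_merged]
    return nn.ModuleList(kept)


# Hypothetical usage on a LLaMA-style checkpoint (attribute path is an assumption):
# model.model.layers = collapse_layers(model.model.layers, start=20, num_merged=4)
```

Note that this sketch leaves the selection of merge groups entirely to the caller; the paper additionally gates each collapse with a layer-wise similarity criterion, which is omitted here for brevity.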

Yifei Yang, Zouying Cao, Hai Zhao • 2024

Related benchmarks

Task | Dataset | Metric | Result | Rank
Language Modeling | WikiText2 | Perplexity | 13.97 | 2839
Language Modeling | WikiText-2 (test) | PPL | 13.97 | 1949
Commonsense Reasoning | HellaSwag | Accuracy | 90.7 | 1891
Language Modeling | WikiText-2 | Perplexity (PPL) | 7.14 | 1624
Commonsense Reasoning | WinoGrande | Accuracy | 82.8 | 1085
Commonsense Reasoning | PIQA | Accuracy | 72.42 | 751
Language Modeling | WikiText2 v1 (test) | Perplexity | 13.97 | 383
Medical Question Answering | MedMCQA | Accuracy | 55.6 | 346
Physical Interaction Question Answering | PIQA | Accuracy | 85.7 | 333
Reading Comprehension | RACE high | Accuracy | 56.92 | 295

Showing 10 of 36 rows.
