
Layer as Puzzle Pieces: Compressing Large Language Models through Layer Concatenation

About

Large Language Models (LLMs) excel at natural language processing tasks, but their massive size leads to high computational and storage demands. Recent work has sought to reduce model size through layer-wise structured pruning, but it tends to neglect the capabilities held in the pruned layers. In this work, we re-examine structured pruning paradigms and uncover several key limitations: 1) notable performance degradation due to direct layer removal, 2) ineffective linear aggregation of layer weights, and 3) the lack of effective post-training recovery mechanisms. To address these limitations, we propose CoMe, which combines a progressive layer pruning framework based on a Concatenation-based Merging technique with a hierarchical distillation post-training process. Specifically, we introduce a channel sensitivity metric that uses activation intensity and weight norms for fine-grained channel selection. We then employ a concatenation-based layer merging method to fuse the most critical channels across adjacent layers, enabling progressive model size reduction. Finally, we propose a hierarchical distillation protocol that leverages the correspondences between the original and pruned model layers established during pruning, enabling efficient knowledge transfer. Experiments on seven benchmarks show that CoMe achieves state-of-the-art performance; when pruning 30% of LLaMA-2-7b's parameters, the pruned model retains 83% of its original average accuracy. Our code is available at https://github.com/MPI-Lab/CoMe.
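The abstract does not state the exact formulas, so the following is only a minimal sketch of the general idea under stated assumptions, not the authors' implementation (see the linked repository for that). It assumes that a channel's sensitivity is its mean absolute activation multiplied by the L2 norm of its input weights, and that merging two adjacent feed-forward layers means concatenating their channels and keeping the highest-scoring ones. The names `channel_scores`, `merge_by_concatenation`, `w_in`, and `w_out` are hypothetical.

```python
import numpy as np

def channel_scores(w_in, activations):
    """Assumed sensitivity: mean |activation| times L2 norm of input weights.

    w_in:        (hidden, d_model) input projection, one row per channel
    activations: (n_tokens, hidden) calibration activations for the channels
    """
    act_intensity = np.abs(activations).mean(axis=0)   # shape (hidden,)
    weight_norm = np.linalg.norm(w_in, axis=1)         # shape (hidden,)
    return act_intensity * weight_norm

def merge_by_concatenation(layer_a, layer_b, acts_a, acts_b, keep):
    """Concatenate the channels of two layers and keep the `keep` best ones."""
    scores = np.concatenate([
        channel_scores(layer_a["w_in"], acts_a),
        channel_scores(layer_b["w_in"], acts_b),
    ])
    w_in = np.concatenate([layer_a["w_in"], layer_b["w_in"]], axis=0)     # (2h, d)
    w_out = np.concatenate([layer_a["w_out"], layer_b["w_out"]], axis=1)  # (d, 2h)
    # Indices of the top-scoring channels, in their original order.
    kept = np.sort(np.argsort(scores)[-keep:])
    return {"w_in": w_in[kept], "w_out": w_out[:, kept]}, kept

# Toy example: two layers with 4 channels each, merged back down to 4 channels.
d_model, hidden = 3, 4
layer_a = {"w_in": np.ones((hidden, d_model)), "w_out": np.ones((d_model, hidden))}
layer_b = {"w_in": np.ones((hidden, d_model)), "w_out": np.ones((d_model, hidden))}
acts_a = np.array([[5.0, 4.0, 0.1, 0.2]])  # layer A: channels 0 and 1 are active
acts_b = np.array([[0.3, 0.1, 6.0, 7.0]])  # layer B: channels 2 and 3 are active
merged, kept = merge_by_concatenation(layer_a, layer_b, acts_a, acts_b, keep=hidden)
print(kept)  # indices into the concatenated channel axis (layer B starts at 4)
```

In this toy case the merged layer has the size of a single layer but draws its channels from both inputs, which is the intuition behind fusing critical channels rather than dropping a whole layer.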

Fei Wang, Li Shen, Liang Ding, Chao Xue, Ye Liu, Changxing Ding • 2025

Related benchmarks

Task                                Dataset                                             Result              Rank
Question Answering                  ARC Challenge                                       Accuracy 40.44      749
Question Answering                  ARC Easy                                            Accuracy 64.23      386
Question Answering                  WinoGrande (WG)                                     Accuracy 70.96      98
Question Answering                  PIQA                                                Accuracy 74.05      83
Multiple-choice Question Answering  HellaSwag                                           Accuracy 68.68      59
Question Answering                  WinoGrande, HellaSwag, ARC-e, ARC-c, PIQA Average   Avg Accuracy 62.93  35