Diet Your LLM: Dimension-wise Global Pruning of LLMs via Merging Task-specific Importance Score
About
Large language models (LLMs) have demonstrated remarkable capabilities, but their massive scale poses significant challenges for practical deployment. Structured pruning offers a promising solution by removing entire dimensions or layers, yet existing methods face critical trade-offs: task-agnostic approaches cannot adapt to task-specific requirements, while task-aware methods require costly training to learn task adaptability. We propose DIET (Dimension-wise global pruning of LLMs via merging Task-wise importance scores), a training-free structured pruning method that combines dimension-level granularity with task-aware selection. DIET profiles activation magnitudes across tasks using only 100 samples per task, then applies majority voting to construct a single global mask. DIET incurs no significant pre-computation or training cost. Experiments on seven zero-shot benchmarks using Gemma-2 2B and 9B models demonstrate the effectiveness of DIET; for example, at 20% sparsity on Gemma-2 2B, DIET achieves nearly a 10% average accuracy improvement over previous state-of-the-art structured pruning methods. This advantage persists across sparsity levels and model scales, positioning DIET as a practical and robust choice for structured LLM pruning.
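The abstract's pipeline (per-task activation profiling, then majority voting into one global mask) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, per-task importance scores are assumed to be pre-computed activation-magnitude profiles of shape `(num_tasks, hidden_dim)`, and the tie-breaking rule (mean importance across tasks) is an assumption.

```python
import numpy as np

def task_masks(importance: np.ndarray, sparsity: float) -> np.ndarray:
    """Per-task binary keep-masks: for each task, prune the
    `sparsity` fraction of dimensions with lowest importance."""
    num_tasks, dim = importance.shape
    k = int(dim * sparsity)  # dimensions pruned per task
    masks = np.ones((num_tasks, dim), dtype=int)
    for t in range(num_tasks):
        pruned = np.argsort(importance[t])[:k]  # least important dims
        masks[t, pruned] = 0
    return masks

def majority_vote_mask(importance: np.ndarray, sparsity: float) -> np.ndarray:
    """Merge per-task masks into one global mask: rank dimensions
    by keep-votes (ties broken by mean importance) and prune the
    bottom `sparsity` fraction globally."""
    masks = task_masks(importance, sparsity)
    votes = masks.sum(axis=0)  # keep-votes per dimension
    dim = importance.shape[1]
    k = int(dim * sparsity)  # global pruning budget
    # lexsort: primary key = votes (ascending), secondary = mean importance
    order = np.lexsort((importance.mean(axis=0), votes))
    global_mask = np.ones(dim, dtype=int)
    global_mask[order[:k]] = 0  # prune fewest-voted, least-important dims
    return global_mask
```

A global mask built this way keeps dimensions that most tasks agree are important, which matches the training-free, task-aware selection the abstract describes.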
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Commonsense Reasoning | HellaSwag | -- | 1891 |
| Commonsense Reasoning | WinoGrande | -- | 1085 |
| Instruction Following | IFEval | -- | 625 |
| Natural Language Inference | RTE | Accuracy: 54.9 | 448 |
| Question Answering | ARC-E | Accuracy: 69 | 416 |
| Question Answering | BoolQ | -- | 317 |
| Question Answering | ARC-C | Accuracy: 38.4 | 192 |
| Recognizing Textual Entailment | RTE | Accuracy: 62.1 | 47 |
| Natural Language Understanding | NLP Suite (BoolQ, RTE, HellaSwag, WinoGrande, ARC-E, ARC-C, OpenBookQA) zero-shot | Average Accuracy: 60.9 | 41 |
| Science Question Answering | ARC Easy | Accuracy (Character-level): 79.1 | 20 |