
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs

About

With the proliferation of task-specific large language models, delta compression has emerged as a method to mitigate the resource challenges of deploying numerous such models by effectively compressing the delta model parameters. Previous delta-sparsification methods either remove parameters randomly or truncate singular vectors directly after singular value decomposition (SVD). However, these methods either disregard parameter importance entirely or evaluate it with too coarse a granularity. In this work, we introduce ImPart, a novel importance-aware delta sparsification approach. Leveraging SVD, it dynamically adjusts sparsity ratios of different singular vectors based on their importance, effectively retaining crucial task-specific knowledge even at high sparsity ratios. Experiments show that ImPart achieves state-of-the-art delta sparsification performance, demonstrating $2\times$ higher compression ratio than baselines at the same performance level. When integrated with existing methods, ImPart sets a new state-of-the-art on delta quantization and model merging.

Yan Yang, Yixia Li, Hongru Wang, Xuetao Wei, Jianqiao Yu, Yun Chen, Guanhua Chen • 2025

Related benchmarks

Task                     Dataset             Metric                Result    Rank
Code Generation          HumanEval           Pass@1                59.76     850
Mathematical Reasoning   GSM8K (test)        Accuracy              64.29     797
Code Generation          HumanEval (test)    Pass@1                58.54     444
Mathematical Reasoning   MATH (test)         Overall Accuracy      13.54     433
Code Generation          MBPP (test)         Pass@1                68.5      276
Code Generation          MBPP                Pass@1                68        175
Instruction Following    IFEval (test)       IFEval Score          35.3      45
Instruction Following    AlpacaEval (test)   Helpfulness Score     2.83e+3   32
Chat                     AlpacaEval          Win Rate              1.88e+3   25
Chat                     IFEval              Loose Prompt Metric   33.27     15

(Showing 10 of 12 rows)

Other info

Code
