
Effective Model Pruning: Measure The Redundancy of Model Components

About

This article initiates the study of a basic question about model pruning. Given a vector $s$ of importance scores assigned to model components, how many of the scored components could be discarded without sacrificing performance? We propose Effective Model Pruning (EMP), which derives the desired sparsity directly from the score distribution using the notion of effective sample size from particle filtering, also known as the inverse Simpson index. Rather than prescribe a pruning criterion, EMP supplies a universal adaptive threshold derived from the distribution of the score $s$ over the model components: EMP maps $s$ to a number $N_{eff}=N_{eff}(s)$, called the effective sample size. The $N-N_{eff}$ lowest scoring components are discarded. A tight lower bound on the effective mass $s_{eff}$ (the sum of retained normalized scores) in terms of $N_{eff}$ is derived. This process yields models with a provable upper bound on the loss change relative to the original dense model. Numerical experiments are performed demonstrating this phenomenon across a variety of network architectures including MLPs, CNNs, Transformers, LLMs, and KAN. It is also shown that EMP addresses a rich set of pruning criteria such as weight magnitude, attention score, KAN importance score, and even feature-level signals such as image pixels.
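The thresholding rule described above can be sketched in a few lines. The sketch assumes the standard effective-sample-size formula, $N_{eff}(s) = (\sum_i s_i)^2 / \sum_i s_i^2$, which equals the inverse Simpson index of the normalized scores. The function names, the requirement that scores be nonnegative, and the choice to round $N_{eff}$ up to an integer are illustrative assumptions and are not taken from the paper.

```python
import numpy as np


def effective_sample_size(scores):
    """Inverse Simpson index of the normalized scores: 1 / sum(p_i^2)."""
    s = np.asarray(scores, dtype=float).ravel()
    if np.any(s < 0):
        raise ValueError("scores are assumed to be nonnegative")
    total = s.sum()
    if total == 0.0:
        return 0.0
    p = s / total  # normalize so the scores sum to 1
    return float(1.0 / np.sum(p ** 2))


def emp_prune(scores):
    """Keep the top-scoring components, discard the N - N_eff lowest.

    Returns (mask, n_eff, effective_mass), where mask is True for
    retained components and effective_mass is the sum of the retained
    normalized scores.
    """
    s = np.asarray(scores, dtype=float)
    flat = s.ravel()
    n_eff = effective_sample_size(flat)
    # Assumption: N_eff is rounded up to an integer count of kept components.
    k = min(int(np.ceil(n_eff)), flat.size)
    mask = np.zeros(flat.size, dtype=bool)
    if k > 0:
        mask[np.argsort(flat)[-k:]] = True  # indices of the k largest scores
    total = flat.sum()
    mass = float(flat[mask].sum() / total) if total > 0 else 0.0
    return mask.reshape(s.shape), n_eff, mass


if __name__ == "__main__":
    mask, n_eff, mass = emp_prune([3.0, 1.0, 0.0, 0.0])
    print(mask, n_eff, mass)
```

Uniform scores give $N_{eff} = N$, so nothing is pruned, while a score vector concentrated on a single component gives $N_{eff} = 1$. Any scoring criterion (weight magnitude, attention score, and so on) can be passed in as `scores`, since the rule depends only on the score distribution.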

Yixuan Wang, Dan P. Guralnik, Saiedeh Akbari, Warren E. Dixon • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Language Modeling | WikiText-2 | Perplexity (PPL): 3.662 | 2320 |
| Zero-shot Evaluation | 7 tasks zero-shot | Mean Accuracy (Zero-shot): 58.74 | 55 |
| Language Modeling | Wikitext (test) | Avg Delta PPL: 0.678 | 4 |
| Zero-shot Evaluation | 7 sub-tasks zero-shot | Avg ∆Acc (%): -0.93 | 4 |