Effective Model Pruning: Measure The Redundancy of Model Components

About

This article initiates the study of a basic question about model pruning: given a vector $s$ of importance scores assigned to model components, how many of the scored components can be discarded without sacrificing performance? We propose Effective Model Pruning (EMP), which derives the desired sparsity directly from the score distribution using the notion of effective sample size from particle filtering, also known as the inverse Simpson index. Rather than prescribing a pruning criterion, EMP supplies a universal adaptive threshold derived from the distribution of the scores $s$ over the model components: EMP maps $s$ to a number $N_{eff}=N_{eff}(s)$, called the effective sample size, and the $N-N_{eff}$ lowest-scoring components are discarded. A tight lower bound on the effective mass $s_{eff}$ (the sum of the retained normalized scores) in terms of $N_{eff}$ is derived. This process yields models with a provable upper bound on the loss change relative to the original dense model. Numerical experiments demonstrate this phenomenon across a variety of network architectures, including MLPs, CNNs, Transformers, LLMs, and KANs. It is also shown that EMP applies to a rich set of pruning criteria, such as weight magnitude, attention score, and KAN importance score, and even to feature-level signals such as image pixels.
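As a rough illustration of the rule described above, the following Python sketch computes the effective sample size as the inverse Simpson index of the normalized scores, $N_{eff} = (\sum_i s_i)^2 / \sum_i s_i^2$, and keeps the top-scoring components. The function names `effective_sample_size` and `emp_keep_mask` are hypothetical. Taking absolute values of the scores and rounding $N_{eff}$ to the nearest integer are assumptions made for this sketch, since the abstract does not say how non-integer values are handled. This is not the authors' implementation.

```python
import numpy as np


def effective_sample_size(scores):
    """Inverse Simpson index of the normalized, nonnegative scores."""
    # Assumption: scores are made nonnegative by taking absolute values.
    s = np.abs(np.asarray(scores, dtype=float)).ravel()
    total = s.sum()
    if total == 0.0:
        raise ValueError("at least one score must be nonzero")
    p = s / total                      # normalized scores, sum to 1
    return float(1.0 / np.sum(p ** 2))


def emp_keep_mask(scores):
    """Return (keep_mask, n_eff, effective_mass) for a score array."""
    s = np.abs(np.asarray(scores, dtype=float))
    n_eff = effective_sample_size(s)
    # Assumption: round N_eff to the nearest integer, keep at least one
    # component and at most all of them.
    k = min(s.size, max(1, int(round(n_eff))))
    flat = s.ravel()
    order = np.argsort(-flat, kind="stable")   # indices, highest score first
    mask = np.zeros(flat.size, dtype=bool)
    mask[order[:k]] = True                     # keep the k highest scores
    # Effective mass: share of the total score carried by kept components.
    effective_mass = float(flat[mask].sum() / flat.sum())
    return mask.reshape(s.shape), n_eff, effective_mass
```

For example, `emp_keep_mask([4.0, 3.0, 2.0, 1.0, 0.1, 0.1])` gives an effective sample size of about 3.47, so under the rounding assumption three components are kept and they carry roughly 88% of the total score. Uniform scores give $N_{eff} = N$ (nothing is pruned), while a single nonzero score gives $N_{eff} = 1$.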

Yixuan Wang, Dan P. Guralnik, Saiedeh Akbari, Warren E. Dixon • 2025

Related benchmarks

Task	Dataset	Metric	Value	
Language Modeling	WikiText-2	Perplexity (PPL)	3.662	2862
Zero-shot Evaluation	7 tasks zero-shot	Mean Accuracy (Zero-shot)	58.74	223
Language Modeling	Wikitext (test)	Avg Delta PPL	0.678	4
Zero-shot Evaluation	7 sub-tasks zero-shot	Avg ∆Acc (%)	-0.93	4
