
Layer-adaptive sparsity for the Magnitude-based Pruning

About

Recent discoveries on neural network pruning reveal that, with a carefully chosen layerwise sparsity, simple magnitude-based pruning achieves a state-of-the-art tradeoff between sparsity and performance. However, without a clear consensus on "how to choose," the layerwise sparsities are mostly selected algorithm-by-algorithm, often resorting to handcrafted heuristics or an extensive hyperparameter search. To fill this gap, we propose a novel importance score for global pruning, coined the layer-adaptive magnitude-based pruning (LAMP) score; the score is a rescaled version of weight magnitude that incorporates the model-level $\ell_2$ distortion incurred by pruning, and does not require any hyperparameter tuning or heavy computation. Under various image classification setups, LAMP consistently outperforms popular existing schemes for layerwise sparsity selection. Furthermore, we observe that LAMP continues to outperform baselines even in weight-rewinding setups, while the connectivity-oriented layerwise sparsity (the strongest baseline overall) performs worse than simple global magnitude-based pruning in this case. Code: https://github.com/jaeho-lee/layer-adaptive-sparsity
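
The abstract describes the LAMP score only in words, so the following is a minimal NumPy sketch of one way to compute it, not the authors' implementation (see the linked repository for that). It assumes the definition given in the paper as we understand it: within a layer, a weight's score is its squared magnitude divided by the sum of squared magnitudes of all weights in that layer that are at least as large. The function names `lamp_scores` and `lamp_masks` are hypothetical and chosen for illustration.

```python
import numpy as np

def lamp_scores(weights):
    """Assumed LAMP score for every entry of one layer's weight tensor."""
    w = np.asarray(weights, dtype=float)
    flat_sq = w.ravel() ** 2
    order = np.argsort(flat_sq)            # indices, ascending by magnitude
    sorted_sq = flat_sq[order]
    # Suffix sums: squared mass of each weight plus all larger weights in the layer.
    suffix = np.cumsum(sorted_sq[::-1])[::-1]
    # Guard against an all-zero layer (score defined as 0 there, an assumption).
    scores_sorted = np.divide(
        sorted_sq, suffix, out=np.zeros_like(sorted_sq), where=suffix > 0
    )
    scores = np.empty_like(flat_sq)
    scores[order] = scores_sorted          # undo the sort
    return scores.reshape(w.shape)

def lamp_masks(layers, sparsity):
    """Boolean keep-masks from one global threshold on the scores.

    Roughly the `sparsity` fraction of weights with the lowest scores is
    pruned; tied scores at the threshold are all pruned.
    """
    scores = [lamp_scores(w) for w in layers]
    all_scores = np.concatenate([s.ravel() for s in scores])
    n_prune = int(sparsity * all_scores.size)
    if n_prune == 0:
        return [np.ones(s.shape, dtype=bool) for s in scores]
    threshold = np.partition(all_scores, n_prune - 1)[n_prune - 1]
    return [s > threshold for s in scores]

# Two toy "layers" with very different weight scales.
layer_a = np.array([1.0, 2.0, 3.0])
layer_b = np.array([10.0, 20.0])
masks = lamp_masks([layer_a, layer_b], sparsity=0.5)
```

In this toy example the scores are 1/14, 4/13, 1 for `layer_a` and 0.2, 1 for `layer_b`, so the weight 10 in `layer_b` is pruned while the weight 2 in `layer_a` is kept. Plain global magnitude pruning would instead remove the two smallest weights, both from `layer_a`. Because the largest weight in each layer always scores 1, no layer is emptied unless the threshold itself reaches 1.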

Jaeho Lee, Sejun Park, Sangwoo Mo, Sungsoo Ahn, Jinwoo Shin • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Commonsense Reasoning | WinoGrande | Accuracy 74.74 | 1442 |
| Natural Language Inference | RTE | Accuracy 62.45 | 590 |
| Question Answering | OBQA | Accuracy 35.4 | 347 |
| Science Question Answering | ARC-C | Accuracy 49.57 | 261 |
| Science Question Answering | ARC-E | Accuracy 76.4 | 240 |
| Language Modeling | WikiText | Word Perplexity 4.98 | 234 |
| Question Answering | BoolQ | Accuracy 79.15 | 201 |
| Image Classification | CIFAR100 (test) | Accuracy 23.11 | 98 |
| Question Answering | OBQA | Accuracy (Normalized) 35 | 29 |
| Image Classification | CIFAR10 (test) | Top-1 Accuracy 79.08 | 28 |
Showing 10 of 14 rows
