Weight Concentration Regularization for Improving Pruning Robustness Under High Sparsity
About
Deep neural networks achieve outstanding performance across vision and language tasks, yet their large parameter counts limit deployment in resource-constrained settings. One-shot pruning reduces model size without retraining, but models trained with standard objectives often suffer substantial accuracy drops under aggressive sparsity. Prior work mitigates this drop along two directions: regularizers such as $\ell_1$ and DeepHoyer that shape the weight distribution during training, and pruning-robust optimizers such as SAM, CrAM, and S$^2$SAM that flatten the loss landscape. However, existing regularizers either shrink all weights uniformly ($\ell_1$) or induce scale-invariant sparsity (DeepHoyer), without concentrating weight energy onto a small set of informative parameters. We propose a Weight Concentration Regularizer (WCR), a training-time regularizer that amplifies the magnitude of a small subset of parameters while driving the remainder toward zero, so that magnitude pruning predominantly removes parameters with negligible functional contribution. We provide a convergence analysis and evaluate WCR on LLM fine-tuning, image classification, and medical segmentation, demonstrating consistent improvements in pruning robustness across architectures and compatibility with existing pruning-robust optimizers.
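To make the contrast between the baselines concrete, here is a minimal sketch of the $\ell_1$ penalty, the squared $\ell_1/\ell_2$ ratio used by DeepHoyer-style regularizers, and one-shot magnitude pruning. It does not implement WCR, whose exact form is not given in this abstract. The function names, the toy weight vectors, and the "retained energy" measure (the fraction of squared weight magnitude that survives pruning, used here as a crude stand-in for functional contribution) are all assumptions made for illustration.

```python
def l1_penalty(weights):
    """Sum of absolute values; scaling the weights by c scales this by |c|."""
    return sum(abs(w) for w in weights)


def hoyer_square(weights):
    """Squared l1/l2 ratio, (sum |w|)^2 / sum w^2; unchanged under rescaling."""
    l2_squared = sum(w * w for w in weights)
    return l1_penalty(weights) ** 2 / l2_squared


def magnitude_prune(weights, sparsity):
    """One-shot magnitude pruning: zero the smallest-magnitude fraction."""
    n_prune = int(len(weights) * sparsity)
    # Indices in ascending order of magnitude; the first n_prune are zeroed.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned = list(weights)
    for i in order[:n_prune]:
        pruned[i] = 0.0
    return pruned


def retained_energy(weights, sparsity):
    """Fraction of squared magnitude left after pruning (illustrative proxy)."""
    pruned = magnitude_prune(weights, sparsity)
    return sum(w * w for w in pruned) / sum(w * w for w in weights)


# Hypothetical toy weight vectors of equal length.
spread = [0.5, -0.4, 0.45, -0.55, 0.5, -0.45, 0.4, 0.55]
concentrated = [2.0, -1.5, 0.01, -0.02, 0.01, 0.0, -0.01, 0.02]

if __name__ == "__main__":
    for name, w in [("spread", spread), ("concentrated", concentrated)]:
        print(name, "hoyer_square:", hoyer_square(w),
              "retained energy at 75% sparsity:", retained_energy(w, 0.75))
```

In this toy setting, the concentrated vector keeps nearly all of its squared magnitude after 75% of its entries are removed, while the spread vector loses most of it. This is the intuition behind concentrating weight magnitude before magnitude pruning. Real pruning robustness depends on the loss, not only on weight magnitudes.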
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-10 (test) | Accuracy | 95.23 | 882 |
| Image Classification | CIFAR-10 | Accuracy | 95.23 | 875 |
| Image Classification | Tiny ImageNet (test) | Accuracy | 87.31 | 722 |
| Image Classification | SVHN (test) | Accuracy | 96.96 | 470 |
| Image Classification | CIFAR-100 (test) | Accuracy | 89.84 | 295 |
| Commonsense Reasoning | Commonsense Reasoning (BoolQ, PIQA, SIQA, HellaS., WinoG., ARC-e, ARC-c, OBQA) (test) | BoolQ Accuracy | 65.96 | 238 |
| Image Classification | SVHN | ACC (Accuracy) | 96.96 | 58 |
| Segmentation | LGG MRI | F1 Score | 92.95 | 29 |