Importance Estimation for Neural Network Pruning
About
Structural pruning of neural network parameters reduces computation, energy, and memory transfer costs during inference. We propose a novel method that estimates the contribution of a neuron (filter) to the final loss and iteratively removes those with smaller scores. We describe two variations of our method, using the first- and second-order Taylor expansions to approximate a filter's contribution. Both methods scale consistently across all network layers without requiring per-layer sensitivity analysis and can be applied to any kind of layer, including skip connections. For modern networks trained on ImageNet, we experimentally measured a high (>93%) correlation between the contribution computed by our methods and a reliable estimate of the true importance. Pruning with the proposed methods leads to an improvement over the state of the art in terms of accuracy, FLOPs, and parameter reduction. On ResNet-101, we achieve a 40% FLOPs reduction by removing 30% of the parameters, with a loss of only 0.02% in top-1 accuracy on ImageNet. Code is available at https://github.com/NVlabs/Taylor_pruning.
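The first-order variant can be sketched as follows: each filter is scored by the squared sum of (gradient × weight) over its parameters, which approximates the change in loss if that filter were zeroed out. This is a minimal NumPy illustration of the idea, not the authors' released implementation; the function name and shapes are assumptions for the example.

```python
import numpy as np

def taylor_importance(weights, grads):
    """First-order Taylor importance per filter (illustrative sketch).

    weights, grads: arrays of shape (num_filters, ...), where grads holds
    the gradient of the loss w.r.t. each weight. The score for filter m is
    (sum over its parameters of g * w) ** 2, approximating the squared
    change in loss caused by removing that filter.
    """
    contrib = (weights * grads).reshape(weights.shape[0], -1)
    return contrib.sum(axis=1) ** 2

# Toy usage: 8 conv filters of shape (3, 3, 3) with random gradients.
rng = np.random.default_rng(0)
w = rng.standard_normal((8, 3, 3, 3))
g = rng.standard_normal((8, 3, 3, 3))
scores = taylor_importance(w, g)

# Prune the 2 filters with the smallest estimated contribution to the loss.
prune_idx = np.argsort(scores)[:2]
print(scores.shape, prune_idx)
```

In practice the scores would be accumulated over many mini-batches before ranking filters, and pruning alternates with fine-tuning, as described above.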
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | CIFAR-10 (test) | -- | -- | 3381 |
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy | 71.7 | 1469 |
| Image Classification | ImageNet-1k (test) | Top-1 Accuracy | 76.43 | 848 |
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy | 80.55 | 706 |
| Image Classification | ImageNet-1K | Top-1 Accuracy | 68.38 | 600 |
| Image Denoising | SIDD (test) | PSNR | 34.8082 | 102 |
| Image Classification | ImageNet (val) | Top-1 Accuracy | 74.5 | 76 |
| Image Classification | ImageNet (val) | Top-1 Accuracy | 77.4 | 68 |
| Image Classification | ImageNet | Top-1 Accuracy | 73.31 | 60 |
| Image Classification | ImageNet (val) | Top-1 Accuracy (Baseline) | 76.18 | 59 |