Learning N:M Fine-grained Structured Sparse Neural Networks From Scratch
About
Sparsity in Deep Neural Networks (DNNs) has been widely studied to compress and accelerate models in resource-constrained environments. It can be generally categorized into unstructured fine-grained sparsity, which zeroes out multiple individual weights distributed across the neural network, and structured coarse-grained sparsity, which prunes blocks of sub-networks of a neural network. Fine-grained sparsity can achieve a high compression ratio but is not hardware friendly and hence yields limited speed gains. On the other hand, coarse-grained sparsity cannot concurrently achieve both apparent acceleration on modern GPUs and decent performance. In this paper, we are the first to study training an N:M fine-grained structured sparse network from scratch, which can maintain the advantages of both unstructured fine-grained sparsity and structured coarse-grained sparsity simultaneously on specifically designed GPUs. Specifically, a 2:4 sparse network could achieve 2x speed-up without performance drop on Nvidia A100 GPUs. Furthermore, we propose a novel and effective ingredient, sparse-refined straight-through estimator (SR-STE), to alleviate the negative influence of the approximated gradients computed by vanilla STE during optimization. We also define a metric, Sparse Architecture Divergence (SAD), to measure the sparse network's topology change during the training process. Finally, we justify SR-STE's advantages with SAD and demonstrate the effectiveness of SR-STE by performing comprehensive experiments on various tasks. Source code and models are available at https://github.com/NM-sparsity/NM-sparsity.
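To make the N:M pattern concrete, the sketch below prunes a flat list of weights so that each group of M consecutive weights keeps at most N nonzero entries (2:4 by default). The abstract does not say how the surviving weights are chosen, so keeping the N largest magnitudes per group is an assumption made here for illustration. The function name `nm_prune` is hypothetical and is not taken from the authors' repository. This shows only the sparsity pattern, not SR-STE or the training procedure.

```python
def nm_prune(weights, n=2, m=4):
    """Return a copy of `weights` with an N:M sparsity pattern applied.

    Each group of `m` consecutive weights keeps `n` entries and zeroes
    the rest. Selection by largest magnitude is an assumption made for
    this sketch, not a rule stated in the abstract.
    """
    if not 0 < n <= m:
        raise ValueError("require 0 < n <= m")
    if len(weights) % m != 0:
        raise ValueError("number of weights must be a multiple of m")

    pruned = []
    for start in range(0, len(weights), m):
        group = weights[start:start + m]
        # Rank positions in the group by magnitude, largest first,
        # and remember the n positions that survive.
        ranked = sorted(range(m), key=lambda i: abs(group[i]), reverse=True)
        keep = set(ranked[:n])
        pruned.extend(w if i in keep else 0.0 for i, w in enumerate(group))
    return pruned


if __name__ == "__main__":
    dense = [0.1, -0.9, 0.5, 0.05, 1.0, 0.2, -0.3, 0.0]
    print(nm_prune(dense))  # two weights survive in each group of four
```

Because every group of four holds exactly two candidates for nonzero values, the layout is regular enough for hardware to exploit, while the choice of which two weights survive remains per-group rather than per-block.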
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy | 79.6 | 1469 |
| Image Classification | ImageNet (val) | Top-1 Acc | 77.6 | 1206 |
| Instance Segmentation | COCO 2017 (val) | -- | -- | 1201 |
| Object Detection | COCO (val) | mAP | 37.7 | 633 |
| Image Classification | ImageNet-1K | Top-1 Acc | 75.1 | 600 |
| Image Classification | ImageNet | Top-1 Accuracy | 77.3 | 431 |
| Object Detection | COCO | mAP | 38.2 | 137 |
| Question Answering | SQuAD v1.1 (val) | F1 Score | 88.52 | 70 |
| Question Answering | SQuAD | F1 Score | 91.1 | 34 |
| Question Answering | SQuAD (val) | F1 Score | 91.1 | 26 |