
PACT: Parameterized Clipping Activation for Quantized Neural Networks

About

Deep learning algorithms achieve high classification accuracy at the expense of significant computation cost. To address this cost, a number of quantization schemes have been proposed, but most of these techniques focus on quantizing weights, which are relatively small compared to activations. This paper proposes a novel quantization scheme for activations during training that enables neural networks to work well with ultra-low-precision weights and activations without any significant accuracy degradation. This technique, PArameterized Clipping acTivation (PACT), uses an activation clipping parameter $\alpha$ that is optimized during training to find the right quantization scale. PACT allows quantizing activations to arbitrary bit precisions, while achieving much better accuracy relative to published state-of-the-art quantization schemes. We show, for the first time, that both weights and activations can be quantized to 4 bits of precision while still achieving accuracy comparable to full-precision networks across a range of popular models and datasets. We also show that exploiting these reduced-precision computational units in hardware can enable a super-linear improvement in inference performance, due to a significant reduction in the area of accelerator compute engines coupled with the ability to retain the quantized model and activation data in on-chip memories.
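
The clipping idea described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. It assumes activations are clipped to the range $[0, \alpha]$ and then rounded onto $2^k - 1$ uniform steps, and that the gradient with respect to $\alpha$ is approximated with a straight-through estimator. The function names `pact_clip`, `pact_quantize`, and `pact_alpha_grad` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def pact_clip(x, alpha):
    """Clip activations to [0, alpha]: a ReLU with a learnable upper bound (assumed form)."""
    return np.clip(x, 0.0, alpha)


def pact_quantize(x, alpha, bits):
    """Clip to [0, alpha], then round onto 2**bits - 1 uniform steps (assumed quantizer)."""
    levels = 2 ** bits - 1      # number of steps between 0 and alpha
    scale = levels / alpha      # maps [0, alpha] onto [0, levels]
    return np.round(pact_clip(x, alpha) * scale) / scale


def pact_alpha_grad(x, alpha):
    """Straight-through estimate of d(output)/d(alpha): 1 where x >= alpha, else 0."""
    return (x >= alpha).astype(x.dtype)


# Hypothetical example values, not taken from the paper.
x = np.array([-1.0, 0.5, 2.0, 7.0])
print(pact_quantize(x, alpha=6.0, bits=2))  # -> [0. 0. 2. 6.]
print(pact_alpha_grad(x, alpha=6.0))        # -> [0. 0. 0. 1.]
```

In this sketch, only activations at or above $\alpha$ contribute a gradient to $\alpha$. A smaller $\alpha$ gives finer quantization steps but clips more activations, which is the trade-off that training $\alpha$ is meant to balance.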

Jungwook Choi, Zhuo Wang, Swagath Venkataramani, Pierce I-Jen Chuang, Vijayalakshmi Srinivasan, Kailash Gopalakrishnan • 2018

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Language Modeling | WikiText-2 (test) | PPL 17.49 | 1541 |
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy 76.5 | 1453 |
| Image Classification | ImageNet (val) | Top-1 Acc 76.5 | 1206 |
| Language Modeling | WikiText-103 (test) | Perplexity 16.76 | 524 |
| Natural Language Understanding | GLUE | SST-2 89.45 | 452 |
| Image Classification | ImageNet-1k (val) | Top-1 Acc 69.2 | 287 |
| Summarization | XSum (test) | ROUGE-2 16.6 | 231 |
| Image Classification | ImageNet-1k (val) | Top-1 Acc 61.4 | 188 |
| Language Modeling | Penn Treebank (PTB) (test) | Perplexity 16.11 | 120 |
| Image Classification | ImageNet (val) | Accuracy 69.2 | 115 |
Showing 10 of 13 rows
