
Quantization and Training of Neural Networks for Efficient Integer-Arithmetic-Only Inference

About

The rising popularity of intelligent mobile devices and the daunting computational cost of deep learning-based models call for efficient and accurate on-device inference schemes. We propose a quantization scheme that allows inference to be carried out using integer-only arithmetic, which can be implemented more efficiently than floating point inference on commonly available integer-only hardware. We also co-design a training procedure to preserve end-to-end model accuracy post quantization. As a result, the proposed quantization scheme improves the tradeoff between accuracy and on-device latency. The improvements are significant even on MobileNets, a model family known for run-time efficiency, and are demonstrated in ImageNet classification and COCO detection on popular CPUs.
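The quantization scheme described above represents real values with an affine mapping of integers, r = S(q − Z), where S is a floating-point scale and Z an integer zero-point, so that inference arithmetic can stay in integers. A minimal NumPy sketch of that mapping is below; the helper names (`choose_qparams`, `quantize`, `dequantize`) are illustrative and not taken from the paper's reference implementation:

```python
import numpy as np

def choose_qparams(rmin, rmax, num_bits=8):
    """Pick a scale S and zero-point Z mapping the real range [rmin, rmax]
    onto the quantized range [0, 2^num_bits - 1].  The range is widened to
    include 0 so that zero is exactly representable (needed e.g. for padding)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    rmin, rmax = min(rmin, 0.0), max(rmax, 0.0)
    scale = (rmax - rmin) / (qmax - qmin)
    zero_point = int(round(qmin - rmin / scale))
    zero_point = max(qmin, min(qmax, zero_point))  # keep Z inside the int range
    return scale, zero_point

def quantize(r, scale, zero_point, num_bits=8):
    """Real array -> uint8 array via q = round(r / S) + Z, clamped to range."""
    qmax = 2 ** num_bits - 1
    q = np.round(np.asarray(r, dtype=np.float32) / scale) + zero_point
    return np.clip(q, 0, qmax).astype(np.uint8)

def dequantize(q, scale, zero_point):
    """Integer array -> real array via r = S * (q - Z)."""
    return scale * (q.astype(np.float32) - zero_point)
```

A round trip `dequantize(quantize(r, S, Z), S, Z)` recovers each value to within one quantization step S, which is the error budget the co-designed training procedure simulates during the forward pass.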

Benoit Jacob, Skirmantas Kligys, Bo Chen, Menglong Zhu, Matthew Tang, Andrew Howard, Hartwig Adam, Dmitry Kalenichenko • 2017

Related benchmarks

Task | Dataset | Result | Rank
Image Classification | ImageNet-1k (val) | Top-1 Accuracy: 70.9 | 1453
Image Classification | ImageNet (val) | Top-1 Acc: 67.3 | 1206
Instance Segmentation | COCO 2017 (val) | -- | 1144
Visual Question Answering | TextVQA | Accuracy: 84.8 | 1117
Image Super-resolution | Manga109 | PSNR: 30.95 | 656
Object Detection | COCO (val) | mAP: 40.4 | 613
Single Image Super-Resolution | Urban100 | PSNR: 26.49 | 500
Single Image Super-Resolution | Set5 | PSNR: 32.39 | 352
OCR Evaluation | OCRBench | Score: 848 | 296
Single Image Super-Resolution | Set14 | PSNR: 28.77 | 252

Showing 10 of 47 rows
