
DoReFa-Net: Training Low Bitwidth Convolutional Neural Networks with Low Bitwidth Gradients

About

We propose DoReFa-Net, a method to train convolutional neural networks that have low bitwidth weights and activations using low bitwidth parameter gradients. In particular, during the backward pass, parameter gradients are stochastically quantized to low bitwidth numbers before being propagated to convolutional layers. As convolutions during the forward/backward passes can now operate on low bitwidth weights and activations/gradients respectively, DoReFa-Net can use bit convolution kernels to accelerate both training and inference. Moreover, as bit convolutions can be efficiently implemented on CPU, FPGA, ASIC and GPU, DoReFa-Net opens the way to accelerating the training of low bitwidth neural networks on such hardware. Our experiments on the SVHN and ImageNet datasets show that DoReFa-Net can achieve prediction accuracy comparable to its 32-bit counterparts. For example, a DoReFa-Net derived from AlexNet that has 1-bit weights and 2-bit activations can be trained from scratch using 6-bit gradients to reach 46.1% top-1 accuracy on the ImageNet validation set. The DoReFa-Net AlexNet model is released publicly.
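The quantization described above can be sketched in NumPy: a deterministic k-bit quantizer of the kind applied to weights and activations, and a stochastic k-bit quantizer illustrating the unbiased rounding applied to gradients. The function names here are ours, and the full DoReFa-Net gradient quantizer additionally rescales each gradient tensor into [0, 1] (and injects noise) before rounding, which this sketch omits.

```python
import numpy as np

def quantize_k(x, k):
    """Deterministic k-bit quantization of x in [0, 1]:
    snap each value to the nearest point on a uniform grid
    with 2^k levels (used for weights and activations)."""
    n = float(2 ** k - 1)
    return np.round(x * n) / n

def stochastic_quantize_k(x, k, rng):
    """Stochastic k-bit rounding of x in [0, 1]: round each
    value up with probability equal to its fractional part on
    the grid, so the quantizer is unbiased in expectation
    (the style of quantization DoReFa-Net uses for gradients)."""
    n = float(2 ** k - 1)
    scaled = x * n
    floor = np.floor(scaled)
    # Round up with probability (scaled - floor), else round down.
    up = rng.random(x.shape) < (scaled - floor)
    return (floor + up) / n
```

The unbiasedness is the key design point: averaged over many steps, the stochastically quantized gradients match the full-precision gradients in expectation, which is why training can proceed with as few as 6 bits per gradient value.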

Shuchang Zhou, Yuxin Wu, Zekun Ni, Xinyu Zhou, He Wen, Yuheng Zou • 2016

Related benchmarks

| Task                   | Dataset           | Result              | Rank |
|------------------------|-------------------|---------------------|------|
| Image Classification   | CIFAR-100 (test)  | --                  | 3518 |
| Image Classification   | CIFAR-10 (test)   | Accuracy 90.2       | 3381 |
| Image Classification   | ImageNet-1k (val) | Top-1 Accuracy 71.4 | 1469 |
| Image Classification   | ImageNet (val)    | Top-1 Acc 52.5      | 1206 |
| Image Super-resolution | Set5              | PSNR 16.43          | 692  |
| Image Classification   | CIFAR-10          | --                  | 564  |
| Image Classification   | CIFAR-10          | Accuracy 90.5       | 508  |
| Image Super-resolution | Urban100          | PSNR 15.09          | 406  |
| Image Classification   | ImageNet (val)    | Accuracy 68.1       | 115  |
| Gait Recognition       | Gait3D            | R-1 Acc 64.3        | 84   |

Showing 10 of 16 rows
