Share your thoughts, 1 month free Claude Pro on usSee more
WorkDL logo mark

Training Binary Neural Networks using the Bayesian Learning Rule

About

Neural networks with binary weights are computation-efficient and hardware-friendly, but their training is challenging because it involves a discrete optimization problem. Surprisingly, ignoring the discrete nature of the problem and using gradient-based methods, such as the Straight-Through Estimator, still works well in practice. This raises the question: are there principled approaches which justify such methods? In this paper, we propose such an approach using the Bayesian learning rule. The rule, when applied to estimate a Bernoulli distribution over the binary weights, results in an algorithm which justifies some of the algorithmic choices made by the previous approaches. The algorithm not only obtains state-of-the-art performance, but also enables uncertainty estimation for continual learning to avoid catastrophic forgetting. Our work provides a principled approach for training binary neural networks which justifies and extends existing approaches.

Xiangming Meng, Roman Bachmann, Mohammad Emtiyaz Khan• 2020

Related benchmarks

TaskDatasetResultRank
Image ClassificationCIFAR-100
Accuracy70.33
691
Image ClassificationCIFAR-10
Accuracy92.28
564
Image ClassificationTiny-ImageNet
Accuracy55.84
269
Image ClassificationImagenette
Accuracy79.59
36
Lifelong Object RecognitionOpenLORIS-Object (12-task stream)
Mean Accuracy89.37
24
Out-of-Distribution DetectionOpenLORIS-Object (held-out toy class)
Aleatoric AUC0.99
24
Online Continual LearningPermuted MNIST 1000-tasks (last 5 tasks)
Mean Accuracy (5 Tasks)86.61
16
Out-of-Distribution DetectionMNIST vs Fashion-MNIST 1000-tasks Permuted
OOD Detection AUC80
15
Image ClassificationPermuted-MNIST Single-task
Accuracy (1 Task)93.22
8
Showing 9 of 9 rows

Other info

Follow for update