Cyclical Stochastic Gradient MCMC for Bayesian Deep Learning
About
The posteriors over neural network weights are high dimensional and multimodal. Each mode typically characterizes a meaningfully different representation of the data. We develop Cyclical Stochastic Gradient MCMC (SG-MCMC) to automatically explore such distributions. In particular, we propose a cyclical stepsize schedule, where larger steps discover new modes, and smaller steps characterize each mode. We also prove non-asymptotic convergence of our proposed algorithm. Moreover, we provide extensive experimental results, including on ImageNet, to demonstrate the scalability and effectiveness of cyclical SG-MCMC in learning complex multimodal distributions, especially for fully Bayesian inference with modern deep neural networks.
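The cyclical stepsize schedule can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes a cosine decay within each cycle, so every cycle restarts at a large stepsize `alpha0` (exploration) and decays toward zero (characterizing the current mode). Function and parameter names are my own.

```python
import math

def cyclical_stepsize(k, total_iters, num_cycles, alpha0):
    """Cosine cyclical stepsize for iteration k (0-indexed).

    Each cycle starts at alpha0 (large steps to jump between modes)
    and decays toward 0 (small steps to sample within a mode).
    """
    iters_per_cycle = math.ceil(total_iters / num_cycles)
    # Fractional position within the current cycle, in [0, 1)
    pos = (k % iters_per_cycle) / iters_per_cycle
    return alpha0 / 2 * (math.cos(math.pi * pos) + 1)
```

For example, with `total_iters=100` and `num_cycles=4`, the stepsize equals `alpha0` at iterations 0, 25, 50, and 75, and shrinks toward zero at the end of each 25-iteration cycle.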
Ruqi Zhang, Chunyuan Li, Jianyi Zhang, Changyou Chen, Andrew Gordon Wilson• 2019
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Classification | STL-10 (test) | Accuracy | 81.84 | 357 |
| Image Classification | CIFAR-10 (test) | Accuracy | 94.41 | 63 |
| Image Classification | CIFAR-100 (test) | Accuracy | 77.79 | 8 |
| Image Classification | CIFAR-10 | LPPD | -0.2794 | 5 |
| Image Classification | Imagenette | LPPD | -0.838 | 4 |