Probabilistic Backpropagation for Scalable Learning of Bayesian Neural Networks
About
Large multilayer neural networks trained with backpropagation have recently achieved state-of-the-art results in a wide range of problems. However, using backprop for neural net learning still has some disadvantages, e.g., having to tune a large number of hyperparameters to the data, lack of calibrated probabilistic predictions, and a tendency to overfit the training data. In principle, the Bayesian approach to learning neural networks does not have these problems. However, existing Bayesian techniques lack scalability to large dataset and network sizes. In this work we present a novel scalable method for learning Bayesian neural networks, called probabilistic backpropagation (PBP). Similar to classical backpropagation, PBP works by computing a forward propagation of probabilities through the network and then doing a backward computation of gradients. A series of experiments on ten real-world datasets show that PBP is significantly faster than other techniques, while offering competitive predictive abilities. Our experiments also show that PBP provides accurate estimates of the posterior variance on the network weights.
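The "forward propagation of probabilities" mentioned in the abstract can be illustrated with a small moment-propagation step. The code below is a minimal sketch, not the authors' implementation. It assumes that each weight and each input is an independent Gaussian described by a mean and a variance, and it computes the mean and variance of the pre-activations of a single linear layer. The function name `linear_moments` and its arguments are hypothetical, and the sketch leaves out everything else a full method would need (nonlinearities, any input scaling, and the backward gradient step).

```python
import numpy as np

def linear_moments(m_in, v_in, M, V):
    """Mean and variance of a = W @ z for independent Gaussian W and z.

    m_in, v_in: mean and variance of each input z_j, shape (d_in,)
    M, V:       mean and variance of each weight w_ij, shape (d_out, d_in)
    (Assumption: all weights and inputs are mutually independent.)
    """
    # E[a_i] = sum_j E[w_ij] * E[z_j]
    m_out = M @ m_in
    # Var(w z) = E[w]^2 Var(z) + Var(w) E[z]^2 + Var(w) Var(z);
    # variances of the independent products add up over j.
    v_out = (M ** 2) @ v_in + V @ (m_in ** 2) + V @ v_in
    return m_out, v_out

# Deterministic inputs, uncertain weights: one output unit, two inputs.
M = np.array([[1.0, 2.0]])
V = np.array([[0.5, 0.5]])
m_out, v_out = linear_moments(np.array([1.0, -1.0]), np.zeros(2), M, V)
print(m_out, v_out)
```

With deterministic inputs the output variance comes entirely from the weight uncertainty; passing a nonzero `v_in` adds the two input-dependent terms.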
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Regression | Yacht | RMSE | 0.88 | 49 |
| Regression | UCI ENERGY (test) | Negative Log Likelihood | 21.64 | 42 |
| Regression | UCI CONCRETE (test) | Neg Log Likelihood | 11.43 | 37 |
| Regression | UCI YACHT (test) | Negative Log Likelihood | 7.06 | 33 |
| Regression | UCI KIN8NM (test) | NLL | 275.9 | 25 |
| Regression | UCI WINE (test) | Negative Log Likelihood | 1.35e+3 | 24 |
| Regression | Energy | RMSE | 1.58 | 13 |
| Regression | Boston | RMSE | 2.89 | 12 |
| Regression | Wine | RMSE | 0.64 | 12 |
| Regression | Yacht | Avg NLL Relative % | 0.14 | 8 |