From Overfitting to Reliability: Introducing the Hierarchical Approximate Bayesian Neural Network
About
In recent years, neural networks have revolutionized various domains, yet challenges such as hyperparameter tuning and overfitting remain significant hurdles. Bayesian neural networks offer a framework to address these challenges by incorporating uncertainty directly into the model, yielding more reliable predictions, particularly for out-of-distribution data. This paper presents the Hierarchical Approximate Bayesian Neural Network (HABNN), a novel approach that places a Gaussian-inverse-Wishart distribution as a hyperprior on the network's weights to increase both the robustness and performance of the model. We provide analytical representations for the predictive distribution and the weight posterior, which reduce to computing the parameters of Student's t-distributions in closed form, at a cost linear in the number of weights. Our method demonstrates robust performance, effectively addressing overfitting and providing reliable uncertainty estimates, particularly for out-of-distribution tasks. Experimental results indicate that HABNN not only matches but often outperforms state-of-the-art models, suggesting a promising direction for future applications in safety-critical environments.
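To give some intuition for why a Gaussian-inverse-Wishart hyperprior leads to Student's t-distributions in closed form, the sketch below works through the textbook one-dimensional analogue: a Normal-inverse-gamma prior on the mean and variance of Gaussian data, whose posterior predictive is a Student's t. This is not the paper's network-level derivation; the class `NIG` and the functions `posterior` and `predictive_t` are hypothetical names introduced only for illustration.

```python
import math
from dataclasses import dataclass


@dataclass
class NIG:
    """Normal-inverse-gamma parameters (1-D analogue of Gaussian-inverse-Wishart)."""
    mu: float     # mean of the Gaussian on the unknown mean
    kappa: float  # pseudo-observation count backing that mean
    alpha: float  # shape of the inverse-gamma on the unknown variance
    beta: float   # scale of the inverse-gamma on the unknown variance


def posterior(prior: NIG, xs: list[float]) -> NIG:
    """Conjugate update; a single pass over the data, so linear in len(xs)."""
    n = len(xs)
    if n == 0:
        return prior
    xbar = sum(xs) / n
    ss = sum((x - xbar) ** 2 for x in xs)  # sum of squared deviations
    kappa_n = prior.kappa + n
    mu_n = (prior.kappa * prior.mu + n * xbar) / kappa_n
    alpha_n = prior.alpha + n / 2
    beta_n = (prior.beta + 0.5 * ss
              + prior.kappa * n * (xbar - prior.mu) ** 2 / (2 * kappa_n))
    return NIG(mu_n, kappa_n, alpha_n, beta_n)


def predictive_t(post: NIG) -> tuple[float, float, float]:
    """Return (degrees of freedom, location, scale) of the Student's t predictive."""
    df = 2 * post.alpha
    loc = post.mu
    scale = math.sqrt(post.beta * (post.kappa + 1) / (post.alpha * post.kappa))
    return df, loc, scale


if __name__ == "__main__":
    post = posterior(NIG(mu=0.0, kappa=1.0, alpha=1.0, beta=1.0), [1.0, 2.0, 3.0])
    print(predictive_t(post))
```

The heavier tails of the resulting t-distribution, compared with a Gaussian of the same scale, are what carry the extra uncertainty about the variance into the prediction.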
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Regression | Yacht | RMSE | 15.28 | 49 |
| Regression | UCI ENERGY (test) | Negative Log Likelihood | 3.81 | 42 |
| Regression | UCI CONCRETE (test) | Neg Log Likelihood | 5.59 | 37 |
| Regression | UCI YACHT (test) | Negative Log Likelihood | 4.27 | 33 |
| Regression | UCI KIN8NM (test) | NLL | 2.87 | 25 |
| Regression | UCI WINE (test) | Negative Log Likelihood | 2.89 | 24 |
| Regression | UCI NAVAL (test) | Negative Log Likelihood | 2.88 | 21 |
| Regression | Kin8nm | Avg NLL Relative Percentage | 22 | 8 |
| Regression | UCI Concrete OOD 3x std deviation rescale non-normalized (train) | RMSE | 19.83 | 8 |
| Regression | UCI Energy OOD 3x std deviation rescale non-normalized (train) | RMSE | 9.99 | 8 |