SLANG: Fast Structured Covariance Approximations for Bayesian Deep Learning with Natural Gradient

About

Uncertainty estimation in large deep-learning models is a computationally challenging task, where it is difficult to form even a Gaussian approximation to the posterior distribution. In such situations, existing methods usually resort to a diagonal approximation of the covariance matrix, despite the fact that these matrices are known to result in poor uncertainty estimates. To address this issue, we propose a new stochastic, low-rank, approximate natural-gradient (SLANG) method for variational inference in large, deep models. Our method estimates a "diagonal plus low-rank" structure based solely on back-propagated gradients of the network log-likelihood. This requires strictly fewer gradient computations than methods that compute the gradient of the whole variational objective. Empirical evaluations on standard benchmarks confirm that SLANG enables faster and more accurate estimation of uncertainty than mean-field methods, and performs comparably to state-of-the-art methods.
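
To illustrate why a "diagonal plus low-rank" structure is cheap to work with, here is a minimal NumPy sketch. It is not the authors' implementation. It assumes, purely for illustration, that the structure is applied to a precision matrix of the form U Uᵀ + diag(d), and the names `U`, `d`, `L`, and `lowrank_plus_diag_solve` are hypothetical. The sketch uses the Woodbury identity to apply the inverse of such a matrix to a vector without ever forming the full D × D matrix, then checks the result against a dense solve.

```python
import numpy as np


def lowrank_plus_diag_solve(U, d, v):
    """Return (U @ U.T + diag(d))^{-1} @ v via the Woodbury identity.

    U: (D, L) low-rank factor, d: (D,) positive diagonal, v: (D,) vector.
    Only an L x L system is solved, so the cost is O(D L^2 + L^3)
    instead of the O(D^3) needed for a dense D x D solve.
    """
    Dinv_v = v / d                          # diag(d)^{-1} v
    Dinv_U = U / d[:, None]                 # diag(d)^{-1} U
    K = np.eye(U.shape[1]) + U.T @ Dinv_U   # small L x L matrix
    correction = Dinv_U @ np.linalg.solve(K, U.T @ Dinv_v)
    return Dinv_v - correction


# Illustrative sizes (assumed): D parameters, rank L << D.
rng = np.random.default_rng(0)
D, L = 500, 5
U = rng.normal(size=(D, L))
d = rng.uniform(0.5, 2.0, size=D)
v = rng.normal(size=D)

fast = lowrank_plus_diag_solve(U, d, v)

# Dense reference: forms the full D x D matrix, for checking only.
dense = np.linalg.solve(U @ U.T + np.diag(d), v)
max_abs_error = float(np.max(np.abs(fast - dense)))
print(max_abs_error)
```

Storing `U` and `d` takes D(L + 1) numbers rather than D², which is the practical appeal of this structure over a full covariance while still capturing correlations that a purely diagonal approximation discards.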

Aaron Mishkin, Frederik Kunstner, Didrik Nielsen, Mark Schmidt, Mohammad Emtiyaz Khan • 2018

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Regression | UCI ENERGY (test) | Negative Log Likelihood | 1.12 | 42 |
| Regression | UCI CONCRETE (test) | Negative Log Likelihood | -3.13 | 37 |
| Regression | UCI YACHT (test) | Negative Log Likelihood | -1.88 | 33 |
| Regression | UCI POWER (test) | Negative Log Likelihood | -2.84 | 29 |
| Regression | Energy UCI (test) | RMSE | 0.64 | 27 |
| Regression | Boston UCI (test) | RMSE | 3.21 | 26 |
| Regression | UCI KIN8NM (test) | -- | -- | 25 |
| Regression | UCI WINE (test) | Negative Log Likelihood | -0.97 | 24 |
| Regression | Concrete UCI (test) | RMSE | 5.58 | 21 |
| Regression | UCI NAVAL (test) | Negative Log Likelihood | 4.76 | 21 |
Showing 10 of 15 rows
