Accelerated Information Gradient flow
About
We present a framework for Nesterov's accelerated gradient flows in probability space, with the aim of designing efficient mean-field Markov chain Monte Carlo (MCMC) algorithms for Bayesian inverse problems. We consider four information metrics: the Fisher-Rao metric, the Wasserstein-2 metric, the Kalman-Wasserstein metric, and the Stein metric. For the Fisher-Rao and Wasserstein-2 metrics, we prove convergence properties of the accelerated gradient flows. For implementation, we propose a sampling-efficient discrete-time algorithm with a restart technique for the Wasserstein-2, Kalman-Wasserstein, and Stein accelerated gradient flows. We also formulate a kernel bandwidth selection method that learns the gradient of the logarithm of the density from Brownian-motion samples. Numerical experiments, including Bayesian logistic regression and Bayesian neural networks, show the strength of the proposed methods compared with state-of-the-art algorithms.
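To make the idea of a discrete-time, particle-based accelerated flow with restart concrete, here is a minimal sketch in the Wasserstein-2 setting. It is not the paper's algorithm. It assumes a toy Gaussian target proportional to exp(-f), replaces the kernel-based estimate of the gradient of the log density with a plain Gaussian fit to the particles, uses a constant damping coefficient, and applies a simple gradient-style restart that zeroes the velocities when they point against the current descent direction. The names `gaussian_score` and `accelerated_flow` and all parameter values are hypothetical.

```python
import numpy as np


def gaussian_score(x):
    """Estimate grad log rho at each particle from a Gaussian fit.

    Assumption: this stands in for a kernel-based density-gradient estimate.
    """
    m = x.mean(axis=0)
    cov = np.atleast_2d(np.cov(x, rowvar=False, bias=True))
    # grad log N(x; m, cov) = -cov^{-1} (x - m), evaluated row by row
    return -np.linalg.solve(cov, (x - m).T).T


def accelerated_flow(x, grad_f, n_steps=500, dt=0.1, damping=2.0, restart=True):
    """Damped particle dynamics: dX = V dt, dV = -damping*V dt - (grad f + grad log rho) dt."""
    v = np.zeros_like(x)
    for _ in range(n_steps):
        # Negative particle-wise gradient of KL(rho || target), up to scaling
        force = -(grad_f(x) + gaussian_score(x))
        # Restart (assumed rule): drop momentum when it opposes the descent direction
        if restart and np.sum(v * force) < 0:
            v = np.zeros_like(x)
        v = (1.0 - damping * dt) * v + dt * force
        x = x + dt * v
    return x


# Toy target: N(mu, I), so f(x) = 0.5 * |x - mu|^2 and grad f(x) = x - mu
mu = np.array([1.0, -2.0])
rng = np.random.default_rng(0)
x0 = 2.0 * rng.standard_normal((200, 2))
samples = accelerated_flow(x0, lambda x: x - mu)
```

In this toy setting the particle mean and covariance should settle near `mu` and the identity. For non-Gaussian targets the Gaussian score estimate is no longer adequate, which is where a kernel-based estimator with a chosen bandwidth would come in.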
Related benchmarks
| Task | Dataset | Result (RMSE) | Rank |
|---|---|---|---|
| Bayesian Neural Networks | UCI Boston (test) | 2.871 | 16 |
| Bayesian Neural Network Regression | concrete (test) | 4.44 | 12 |
| Bayesian Neural Network Regression | WINE (test) | 0.606 | 12 |
| Bayesian Neural Network Regression | Combined (test) | 4.067 | 12 |
| Bayesian Neural Network Regression | kin8nm (test) | 0.094 | 12 |