Understanding MCMC Dynamics as Flows on the Wasserstein Space
About
It is known that the Langevin dynamics used in MCMC is the gradient flow of the KL divergence on the Wasserstein space, which helps convergence analysis and inspires recent particle-based variational inference methods (ParVIs). However, no other MCMC dynamics has been understood in this way. In this work, by developing novel concepts, we propose a theoretical framework that recognizes a general MCMC dynamics as the fiber-gradient Hamiltonian flow on the Wasserstein space of a fiber-Riemannian Poisson manifold. The "conservation + convergence" structure of the flow gives a clear picture of the behavior of general MCMC dynamics. The framework also enables ParVI simulation of MCMC dynamics, which enriches the ParVI family with more efficient dynamics and also brings ParVI advantages to MCMC methods. We develop two ParVI methods for a particular MCMC dynamics and demonstrate the benefits in experiments.
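As a rough illustration of the Langevin dynamics that the abstract takes as its starting point, the sketch below runs a simple Euler discretization (often called the unadjusted Langevin algorithm) on a one-dimensional standard Gaussian target. It is not the paper's fiber-gradient Hamiltonian flow or either of its ParVI methods. The function names, the target, the step size, and the particle count are all assumptions chosen for the example.

```python
import numpy as np


def grad_log_target(x):
    # Assumed target for illustration: standard Gaussian,
    # log p(x) = -x^2 / 2 + const, so the gradient is -x.
    return -x


def langevin_step(x, step_size, rng):
    # One Euler step of dx = grad log p(x) dt + sqrt(2) dW.
    noise = rng.standard_normal(x.shape)
    return x + step_size * grad_log_target(x) + np.sqrt(2.0 * step_size) * noise


def run_langevin(n_particles=5000, n_steps=500, step_size=0.05, seed=0):
    rng = np.random.default_rng(seed)
    # Deliberately poor initialization, far from the target.
    x = rng.uniform(-5.0, 5.0, size=n_particles) + 3.0
    for _ in range(n_steps):
        x = langevin_step(x, step_size, rng)
    return x


if __name__ == "__main__":
    samples = run_langevin()
    # The empirical mean and variance should be close to 0 and 1,
    # up to Monte Carlo noise and a small discretization bias.
    print(samples.mean(), samples.var())
```

The particle cloud as a whole moves toward the target distribution, which is the distribution-level behavior that the gradient-flow view describes. The finite step size introduces a small bias in the stationary variance, so the match is only approximate.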
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Bayesian Neural Network Inference | Boston UCI (test) | Test Log-Likelihood | -2.42 | 8 |
| Bayesian Neural Network Inference | Concrete UCI (test) | Test Log-Likelihood | -2.95 | 8 |
| Bayesian Neural Network Inference | Yacht UCI (test) | Test Log-Likelihood | -0.73 | 8 |
| Bayesian Neural Network Inference | Power Plant (UCI) (test) | Test Log-Likelihood | -2.77 | 8 |
| Bayesian Neural Network Inference | Energy UCI (test) | Test Log-Likelihood | -1.36 | 8 |
| Bayesian Neural Network Inference | Kin8nm UCI (test) | Test Log-Likelihood | 1.23 | 8 |
| Bayesian Neural Network Inference | YearPredictionMSD UCI (test) | Test Log-Likelihood | -3.6 | 8 |
| Bayesian Neural Network Inference | Protein UCI (test) | Test Log-Likelihood | -3.73 | 8 |