Score-Based Diffusion meets Annealed Importance Sampling
About
More than twenty years after its introduction, Annealed Importance Sampling (AIS) remains one of the most effective methods for marginal likelihood estimation. It relies on a sequence of distributions interpolating between a tractable initial distribution and the target distribution of interest, from which one samples approximately using a non-homogeneous Markov chain. To obtain an importance sampling estimate of the marginal likelihood, AIS introduces an extended target distribution to reweight the Markov chain proposal. While much effort has been devoted to improving the proposal distribution used by AIS, an underappreciated issue is that AIS uses a convenient but suboptimal extended target distribution. Here we leverage recent progress in score-based generative modeling (SGM) to approximate the optimal extended target distribution, i.e. the one minimizing the variance of the marginal likelihood estimate, for AIS proposals corresponding to discretizations of Langevin and Hamiltonian dynamics. We demonstrate these novel differentiable AIS procedures on a number of synthetic benchmark distributions and variational auto-encoders.
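To make the baseline concrete, the following is a minimal sketch of vanilla AIS with NumPy, not the paper's score-based variant: it anneals along the standard geometric path between a normalized initial density `log_p0` and an unnormalized target `log_p1`, accumulates the incremental importance weights, and applies one random-walk Metropolis step per temperature (a simple stand-in for the Langevin/Hamiltonian transitions discussed above). All function and parameter names here are illustrative assumptions, not from the paper.

```python
import numpy as np

def ais_log_z(log_p0, sample_p0, log_p1, n_steps=200, n_chains=1000,
              step=0.3, rng=None):
    """Estimate log Z of the unnormalized target log_p1 by vanilla AIS.

    Uses the geometric path log gamma_t = (1 - beta_t) log_p0 + beta_t log_p1
    and one random-walk Metropolis transition per intermediate distribution.
    log_p0 must be a *normalized* log-density so the estimate is log Z directly.
    """
    rng = np.random.default_rng(rng)
    betas = np.linspace(0.0, 1.0, n_steps + 1)

    def log_gamma(z, beta):
        return (1.0 - beta) * log_p0(z) + beta * log_p1(z)

    x = sample_p0(n_chains, rng)            # (n_chains, dim) draws from p0
    log_w = np.zeros(n_chains)              # accumulated log importance weights

    for t in range(1, n_steps + 1):
        b_prev, b = betas[t - 1], betas[t]
        # Incremental weight: ratio of consecutive annealed densities at x.
        log_w += log_gamma(x, b) - log_gamma(x, b_prev)
        # One Metropolis step targeting the current intermediate distribution.
        prop = x + step * rng.standard_normal(x.shape)
        log_acc = log_gamma(prop, b) - log_gamma(x, b)
        accept = np.log(rng.uniform(size=n_chains)) < log_acc
        x[accept] = prop[accept]

    # log Z estimate via a numerically stable log-mean-exp of the weights.
    m = log_w.max()
    return m + np.log(np.mean(np.exp(log_w - m)))
```

For example, with `log_p0` a standard normal in dimension `d` and the unnormalized target `exp(-||x - mu||^2 / 2)`, the true value is `log Z = (d/2) log(2*pi)`, which the estimator recovers up to Monte Carlo error. The choice of extended target is implicit here: reweighting with the reverse of the forward kernels is exactly the "convenient but suboptimal" construction the abstract refers to.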
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Variational Inference | MNIST (test) | Negative ELBO | -90.1 | 10 |
| log Z estimation | MNIST downsampled (test) | Log Z Absolute Error | 0.11 | 9 |
| log Z estimation | Student-t Dim-500 | Log Likelihood (Z) | 0.06 | 8 |
| Marginal Likelihood Estimation | Gaussian mixture target, 8 components, Dim-200 (test) | Log Z | 0.2 | 8 |
| Marginal Likelihood Estimation | Gaussian mixture target, 8 components, Dim-500 (test) | Log Z | 1.01 | 8 |
| Variational Inference | Student-t distribution toy example, dim=20 | Log Z | 0.00e+0 | 8 |
| Marginal Likelihood Estimation | Gaussian mixture target, 8 components, Dim-20 (test) | Log Z | 0.01 | 8 |
| Variational Inference | Student-t distribution toy example, dim=200 | Log Evidence (Z) | -0.1 | 8 |