A Langevin-like Sampler for Discrete Distributions
About
We propose the discrete Langevin proposal (DLP), a simple and scalable gradient-based proposal for sampling from complex high-dimensional discrete distributions. In contrast to Gibbs-sampling-based methods, DLP updates all coordinates in parallel in a single step, with the magnitude of the change controlled by a stepsize. This allows cheap and efficient exploration in the space of high-dimensional and strongly correlated variables. We prove the efficiency of DLP by showing that the asymptotic bias of its stationary distribution is zero for log-quadratic distributions and is small for distributions that are close to log-quadratic. With DLP, we develop several variants of sampling algorithms, including unadjusted, Metropolis-adjusted, stochastic, and preconditioned versions. DLP outperforms many popular alternatives on a wide variety of tasks, including Ising models, restricted Boltzmann machines, deep energy-based models, binary neural networks, and language generation.
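To make the proposal concrete, below is a minimal NumPy sketch of one unadjusted DLP step for the binary case x ∈ {0, 1}^D, assuming the target is π(x) ∝ exp(f(x)) and that a gradient oracle is supplied by the caller. Per coordinate i, DLP draws the candidate x'_i from a categorical with logits (1/2)∇f(x)_i (x'_i − x_i) − (x'_i − x_i)²/(2α), which for binary variables reduces to an independent flip probability σ(½∇f(x)_i (1 − 2x_i) − 1/(2α)). The function name `dlp_proposal`, its argument names, and the toy target in the usage snippet are illustrative assumptions, not the authors' code; a Metropolis-adjusted variant would additionally accept or reject the joint move using the ratio of reverse and forward proposal probabilities.

```python
import numpy as np

def dlp_proposal(x, grad_f, alpha, rng):
    """One unadjusted DLP step for binary x in {0, 1}^D (illustrative sketch).

    All coordinates are updated in parallel: coordinate i flips with
    probability sigmoid(0.5 * grad_f(x)_i * (1 - 2 x_i) - 1 / (2 * alpha)).
    The gradient term rewards moves uphill in log-density, while the
    1/(2*alpha) penalty keeps the proposal local; a larger stepsize alpha
    allows bolder moves.
    """
    g = grad_f(x)                      # gradient of the log-density f at x
    diff = 1.0 - 2.0 * x               # x'_i - x_i if coordinate i flips
    flip_logit = 0.5 * g * diff - 1.0 / (2.0 * alpha)
    p_flip = 1.0 / (1.0 + np.exp(-flip_logit))
    flips = rng.random(x.shape) < p_flip
    return np.where(flips, 1.0 - x, x)

# Toy usage on a log-quadratic target f(x) = x^T W x / 2 (so grad f = W x),
# the case for which the paper shows the asymptotic bias is zero.
rng = np.random.default_rng(0)
D = 16
W = rng.normal(scale=0.1, size=(D, D))
W = (W + W.T) / 2.0                    # symmetric couplings
grad_f = lambda x: W @ x
x = rng.integers(0, 2, size=D).astype(float)
for _ in range(1000):
    x = dlp_proposal(x, grad_f, alpha=0.5, rng=rng)
```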
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Modeling | Omniglot (test) | NLL | 94.72 | 27 |
| Conditional estimation | Dynamic MNIST (test) | Test Log Likelihood | -80.12 | 18 |
| Generative Modeling | Omniglot (test) | Log Likelihood | -99.243 | 8 |
| Discrete Image Modelling | MNIST Static (test) | NLL | 80.01 | 6 |
| Discrete Image Modelling | MNIST dynamic (test) | NLL | 80.51 | 6 |
| Training Ising Models | Lattice Ising model D=10^2 | Mean negative log-RMSE (σ=0.1) | 4.8 | 5 |
| Training Ising Models | Lattice Ising model D=9^2 | Mean negative log-RMSE (σ=-0.1) | 4.8 | 5 |
| RBM learning | Caltech Silhouettes (test) | Log Likelihood (AIS) | -427.3 | 4 |
| RBM learning | MNIST (test) | Log Likelihood (AIS) | -278.4 | 4 |
| RBM learning | EMNIST (test) | Log Likelihood (AIS) | -324.3 | 4 |