Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

About

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in a number of difficult settings, including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate the use of our improved sampler for training deep energy-based models on high-dimensional discrete data, where it outperforms variational autoencoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers that propose local updates.

Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris J. Maddison • 2021
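
To make the abstract's idea concrete, here is a minimal sketch (not the authors' released code) of a gradient-informed Metropolis-Hastings step on binary variables: a first-order Taylor expansion of the log-likelihood scores every single-bit flip, a flip index is sampled from a softmax over those scores, and a standard MH correction keeps the chain exact. The function names (`gwg_step`, `flip_gains`) and the toy quadratic model at the bottom are illustrative assumptions of this sketch; the 1/2 scaling of the logits follows the paper's binary-variable proposal.

```python
import torch

def gwg_step(x, log_prob):
    """One gradient-informed Metropolis-Hastings step on a binary state.

    x: 1-D tensor with 0./1. entries.
    log_prob: differentiable unnormalized log-likelihood, (D,) -> scalar.
    """
    def flip_gains(z):
        # First-order Taylor estimate of f(flip_i(z)) - f(z) for every i:
        # flipping coordinate i changes z_i by -(2 z_i - 1).
        z = z.detach().requires_grad_(True)
        (grad,) = torch.autograd.grad(log_prob(z), z)
        return (-(2.0 * z - 1.0) * grad).detach()

    # Propose the index to flip with probability proportional to
    # exp(estimated gain / 2), so promising flips are tried more often.
    forward = torch.distributions.Categorical(logits=flip_gains(x) / 2.0)
    i = forward.sample()
    x_prop = x.clone()
    x_prop[i] = 1.0 - x_prop[i]

    # Reverse proposal probability, needed for the MH acceptance ratio.
    reverse = torch.distributions.Categorical(logits=flip_gains(x_prop) / 2.0)
    log_alpha = (log_prob(x_prop) - log_prob(x)
                 + reverse.log_prob(i) - forward.log_prob(i))
    return x_prop if torch.rand(()).log() < log_alpha else x

# Illustrative usage: a tiny quadratic model on 16 binary variables.
J = torch.randn(16, 16)
J = (J + J.T) / 2
x = (torch.rand(16) < 0.5).float()
for _ in range(100):
    x = gwg_step(x, lambda z: z @ J @ z)
```

Because the Taylor estimate only informs the proposal, any bias it introduces is removed by the accept/reject step; the gradient simply concentrates proposals on flips likely to be accepted.
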

Related benchmarks

| Task                       | Dataset                    | Metric                   | Result | Rank |
|----------------------------|----------------------------|--------------------------|--------|------|
| Conditional estimation     | Dynamic MNIST (test)       | Test Log-Likelihood      | -80.51 | 18   |
| Graph generation           | Ego-small (test)           | Degree                   | 0.095  | 11   |
| Generative modeling        | Omniglot (test)            | Log-Likelihood           | -94.72 | 8    |
| Regression                 | Breast Cancer (UCI) (test) | Avg. Test Log-Likelihood | 0.0241 | 5    |
| Regression                 | COMPAS (UCI) (test)        | Avg. Test Log-Likelihood | 0.2265 | 5    |
| Regression                 | HIV (UCI) (test)           | Avg. Test Log-Likelihood | 0.7025 | 5    |
| Regression                 | Blog (UCI) (test)          | Avg. Test Log-Likelihood | 0.2799 | 5    |
| Traveling Salesman Problem | eil14                      | Cost                     | 370.7  | 5    |
| RBM learning               | MNIST (test)               | Log-Likelihood (AIS)     | -387.3 | 4    |
| RBM learning               | EMNIST (test)              | Log-Likelihood (AIS)     | -591   | 4    |

Showing 10 of 14 rows.
