
Oops I Took A Gradient: Scalable Sampling for Discrete Distributions

About

We propose a general and scalable approximate sampling strategy for probabilistic models with discrete variables. Our approach uses gradients of the likelihood function with respect to its discrete inputs to propose updates in a Metropolis-Hastings sampler. We show empirically that this approach outperforms generic samplers in a number of difficult settings including Ising models, Potts models, restricted Boltzmann machines, and factorial hidden Markov models. We also demonstrate the use of our improved sampler for training deep energy-based models on high dimensional discrete data. This approach outperforms variational auto-encoders and existing energy-based models. Finally, we give bounds showing that our approach is near-optimal in the class of samplers which propose local updates.

Will Grathwohl, Kevin Swersky, Milad Hashemi, David Duvenaud, Chris J. Maddison • 2021
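
The abstract describes proposing Metropolis-Hastings updates from gradients taken with respect to the discrete inputs. The following is a minimal sketch of that idea, not the authors' implementation. It assumes a small Ising-like model over binary variables x in {0,1}^D with a symmetric, zero-diagonal coupling matrix `W` and bias `b`, and it proposes one bit flip per step. All function names here are hypothetical and chosen for illustration only.

```python
import numpy as np

def log_prob(x, W, b):
    # Unnormalized log-probability of an assumed Ising-like model, x in {0,1}^D.
    return x @ W @ x + b @ x

def grad_log_prob(x, W, b):
    # Gradient of log_prob with x treated as continuous (W assumed symmetric).
    return 2.0 * W @ x + b

def flip_logits(x, W, b):
    # First-order estimate of the change in log_prob from flipping each bit,
    # halved to form the proposal logits.
    d = -(2.0 * x - 1.0) * grad_log_prob(x, W, b)
    return d / 2.0

def log_softmax(z):
    # Numerically stable log-softmax over a 1-D array.
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def gradient_informed_mh_step(x, W, b, rng):
    # Pick a bit to flip with probability given by the gradient-based logits.
    log_q_fwd = log_softmax(flip_logits(x, W, b))
    i = rng.choice(x.size, p=np.exp(log_q_fwd))
    x_new = x.copy()
    x_new[i] = 1.0 - x_new[i]
    # Probability of proposing the reverse flip from the new state.
    log_q_rev = log_softmax(flip_logits(x_new, W, b))
    # Metropolis-Hastings acceptance test in log space.
    log_accept = (log_prob(x_new, W, b) - log_prob(x, W, b)
                  + log_q_rev[i] - log_q_fwd[i])
    if np.log(rng.random()) < log_accept:
        return x_new, True
    return x, False
```

In this assumed model the first-order estimate happens to be exact because the log-probability is linear in each individual bit. For a general energy function it is only an approximation, and the acceptance step is what keeps the sampler targeting the intended distribution.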

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Graph generation | Ego-small (test) | Degree | 0.095 | 19 |
| Conditional estimation | Dynamic MNIST (test) | Test Log Likelihood | -80.51 | 18 |
| Generative Modeling | Omniglot (test) | Log Likelihood | -94.72 | 8 |
| Regression | Breast Cancer (UCI) (test) | Avg Test Log-likelihood | 0.0241 | 5 |
| Regression | COMPAS UCI (test) | Average Test Log-likelihood | 0.2265 | 5 |
| Regression | HIV UCI (test) | Avg Test Log-likelihood | 0.7025 | 5 |
| Regression | Blog UCI (test) | Avg Test Log-likelihood | 0.2799 | 5 |
| Traveling Salesman Problem | eil14 | Cost | 370.7 | 5 |
| RBM learning | MNIST (test) | Log Likelihood (AIS) | -387.3 | 4 |
| RBM learning | EMNIST (test) | Log Likelihood (AIS) | -591 | 4 |
Showing 10 of 14 rows
