
Maximum Likelihood Training of Score-Based Diffusion Models

About

Score-based diffusion models synthesize samples by reversing a stochastic process that diffuses data to noise, and are trained by minimizing a weighted combination of score matching losses. The log-likelihood of score-based diffusion models can be tractably computed through a connection to continuous normalizing flows, but log-likelihood is not directly optimized by the weighted combination of score matching losses. We show that for a specific weighting scheme, the objective upper bounds the negative log-likelihood, thus enabling approximate maximum likelihood training of score-based diffusion models. We empirically observe that maximum likelihood training consistently improves the likelihood of score-based diffusion models across multiple datasets, stochastic processes, and model architectures. Our best models achieve negative log-likelihoods of 2.83 and 3.76 bits/dim on CIFAR-10 and ImageNet 32x32 without any data augmentation, on a par with state-of-the-art autoregressive models on these tasks.
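The specific weighting scheme referenced above is the likelihood weighting λ(t) = g(t)², where g is the diffusion coefficient of the forward SDE. As an illustration only (not the authors' released code), below is a minimal PyTorch sketch of denoising score matching with this weighting for the VP SDE, where g(t)² = β(t); the `score_model` callable, the linear β-schedule constants, and the (batch, dim) input shape are all assumptions.

```python
# Minimal sketch of likelihood-weighted denoising score matching for the
# VP SDE. Hypothetical names throughout; this is not the authors' code.
import torch

beta_0, beta_1 = 0.1, 20.0  # assumed linear schedule: beta(t) = beta_0 + t*(beta_1 - beta_0)

def beta(t):
    return beta_0 + t * (beta_1 - beta_0)

def marginal_params(t):
    # VP SDE perturbation kernel: x_t = mean_coeff * x_0 + std * noise
    log_mean_coeff = -0.25 * t**2 * (beta_1 - beta_0) - 0.5 * t * beta_0
    mean_coeff = torch.exp(log_mean_coeff)
    std = torch.sqrt(1.0 - torch.exp(2.0 * log_mean_coeff))
    return mean_coeff, std

def likelihood_weighted_dsm_loss(score_model, x0, eps_t=1e-5):
    """Denoising score matching with lambda(t) = g(t)^2 = beta(t).

    Assumes x0 has shape (batch, dim) and score_model(x, t) returns an
    estimate of grad_x log p_t(x) with the same shape as x.
    """
    b = x0.shape[0]
    # Sample t ~ U[eps_t, 1] and perturb the data with the VP kernel.
    t = torch.rand(b, device=x0.device) * (1.0 - eps_t) + eps_t
    mean_coeff, std = marginal_params(t)
    noise = torch.randn_like(x0)
    xt = mean_coeff.view(-1, 1) * x0 + std.view(-1, 1) * noise
    score = score_model(xt, t)                 # model's score estimate
    target = -noise / std.view(-1, 1)          # grad_x log p_t(x_t | x_0)
    weight = beta(t)                           # likelihood weighting g(t)^2
    per_example = weight * ((score - target) ** 2).sum(dim=1)
    return 0.5 * per_example.mean()
```

With this choice of weight, rather than a generic weighting, the objective upper-bounds the model's negative log-likelihood up to a constant, which is the paper's central result.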

Yang Song, Conor Durkan, Iain Murray, Stefano Ermon • 2021

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Image Generation | CIFAR-10 (test) | FID | 2.2 | 471 |
| Image Generation | CIFAR10 32x32 (test) | FID | 3.98 | 154 |
| Density Estimation | CIFAR-10 (test) | Bits/dim | 2.83 | 134 |
| Image Generation | ImageNet 64x64 | FID | 24.95 | 114 |
| Unconditional Generation | CIFAR-10 (test) | FID | 2.87 | 102 |
| Density Estimation | ImageNet 32x32 (test) | Bits per Sub-pixel | 3.76 | 66 |
| Generative Modeling | CIFAR-10 (test) | NLL (bits/dim) | 2.83 | 62 |
| Unconditional Image Generation | CIFAR10 | -- | -- | 33 |
| Likelihood Estimation | CIFAR-10 (test) | NLL (BPD) | 2.9 | 24 |
| Image Generation | ImageNet-32 | FID | 8.31 | 20 |
Showing 10 of 17 rows

Other info

Code
