Improved off-policy training of diffusion samplers
About
We study the problem of training diffusion models to sample from a distribution with a given unnormalized density or energy function. We benchmark several diffusion-structured inference methods, including simulation-based variational approaches and off-policy methods (continuous generative flow networks). Our results shed light on the relative advantages of existing algorithms while calling into question some claims from past work. We also propose a novel exploration strategy for off-policy methods, based on local search in the target space combined with a replay buffer, and show that it improves the quality of samples on a variety of target distributions. Our code for the sampling methods and benchmarks studied is made public at https://github.com/GFNOrg/gfn-diffusion as a base for future work on diffusion models for amortized inference.
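The exploration strategy described above can be illustrated with a minimal sketch: sampler outputs are refined by local search on the target energy, and the improved samples are stored in a replay buffer from which off-policy training batches are drawn. All names below (`energy`, `local_search`, `ReplayBuffer`) are illustrative stand-ins, not the repository's actual API; the toy double-well energy and the greedy random-walk search are assumptions for demonstration only.

```python
import numpy as np

def energy(x):
    # Toy double-well target energy, summed over dimensions (illustrative only).
    return np.sum((x ** 2 - 1.0) ** 2, axis=-1)

def local_search(x, step=0.1, n_steps=50, rng=None):
    """Greedy random-walk search: propose a Gaussian perturbation and
    accept it whenever it lowers the target energy."""
    rng = np.random.default_rng() if rng is None else rng
    x = x.copy()
    e = energy(x)
    for _ in range(n_steps):
        prop = x + step * rng.standard_normal(x.shape)
        e_prop = energy(prop)
        improved = e_prop < e          # accept only energy-decreasing moves
        x[improved] = prop[improved]
        e[improved] = e_prop[improved]
    return x

class ReplayBuffer:
    """Fixed-capacity FIFO buffer of refined samples for off-policy training."""
    def __init__(self, capacity):
        self.capacity, self.data = capacity, []

    def add(self, batch):
        self.data.extend(batch)
        self.data = self.data[-self.capacity:]

    def sample(self, n, rng=None):
        rng = np.random.default_rng() if rng is None else rng
        idx = rng.integers(len(self.data), size=n)
        return np.stack([self.data[i] for i in idx])

rng = np.random.default_rng(0)
x0 = rng.standard_normal((64, 2))   # stand-in for raw sampler outputs
x1 = local_search(x0, rng=rng)      # refined, lower-energy samples
buf = ReplayBuffer(capacity=1000)
buf.add(list(x1))
batch = buf.sample(32, rng=rng)     # off-policy training batch
```

Because moves are only accepted when they decrease the energy, the refined samples are never worse than the originals under the target; replaying them lets the off-policy objective learn from low-energy regions the sampler has not yet reached on its own.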
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Unconditional modeling | 25GMM (d = 2) | Δ log Z = 1.176 | 30 |
| Unconditional modeling | Funnel (d = 10) | Δ log Z = 0.642 | 30 |
| Unconditional modeling | Manywell (d = 32) | Δ log Z = 7.46 | 29 |
| Conditional sampling | MNIST pretrained VAE decoder (test) | log Z = -99.472 | 15 |
| Unconditional modeling | Log-Gaussian Cox process (d = 1600) | Δ log Z = 471.1 | 13 |