Multilevel and Sequential Monte Carlo for Training-Free Diffusion Guidance
About
We address the problem of accurate, training-free guidance for conditional generation in trained diffusion models. Existing methods typically rely on point-estimates to approximate the posterior score, often resulting in biased approximations that fail to capture multimodality inherent to the reverse process of diffusion models. We propose a sequential Monte Carlo (SMC) framework that constructs an unbiased estimator of $p_\theta(y|x_t)$ by integrating over the full denoising distribution via Monte Carlo approximation. To ensure computational tractability, we incorporate variance-reduction schemes based on Multi-Level Monte Carlo (MLMC). Our approach achieves new state-of-the-art results for training-free guidance on CIFAR-10 class-conditional generation, achieving $95.6\%$ accuracy with $3\times$ lower cost-per-success than baselines. On ImageNet, our algorithm achieves $1.5\times$ cost-per-success advantage over existing methods.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Label Guidance | CIFAR-10 (evaluation) | Accuracy0.956 | 6 | |
| Conditional Image Generation | ImageNet 256x256 (Classes 111, 130, 207, 222, 333, 444) | Success Rate58.3 | 2 |