
Structure-Guided Adversarial Training of Diffusion Models

About

Diffusion models have demonstrated exceptional efficacy in various generative applications. While existing models focus on minimizing a weighted sum of denoising score matching losses for data distribution modeling, their training primarily emphasizes instance-level optimization, overlooking valuable structural information within each mini-batch, indicative of pair-wise relationships among samples. To address this limitation, we introduce Structure-guided Adversarial training of Diffusion Models (SADM). In this pioneering approach, we compel the model to learn manifold structures between samples in each training batch. To ensure the model captures authentic manifold structures in the data distribution, we advocate adversarial training of the diffusion generator against a novel structure discriminator in a minimax game, distinguishing real manifold structures from the generated ones. SADM substantially improves existing diffusion transformers (DiT) and outperforms existing methods in image generation and cross-domain fine-tuning tasks across 12 datasets, establishing a new state-of-the-art FID of 1.58 and 2.11 on ImageNet for class-conditional image generation at resolutions of 256x256 and 512x512, respectively.
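The minimax idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' implementation. It assumes that the "structure" of a mini-batch is the matrix of pairwise distances between samples, and that the structure discriminator is a linear embedding. The names `embed`, `pairwise_structure`, `structure_loss`, `weights`, and `lam` are hypothetical and chosen only for this illustration.

```python
import numpy as np

def embed(batch, weights):
    """Hypothetical structure discriminator: a linear map of flattened samples."""
    flat = batch.reshape(batch.shape[0], -1)
    return flat @ weights

def pairwise_structure(features):
    """(B, B) matrix of Euclidean distances between all pairs in the batch."""
    diff = features[:, None, :] - features[None, :, :]
    return np.sqrt(np.sum(diff ** 2, axis=-1))

def structure_loss(real_batch, generated_batch, weights):
    """Mean absolute gap between real and generated pairwise structures."""
    real_s = pairwise_structure(embed(real_batch, weights))
    fake_s = pairwise_structure(embed(generated_batch, weights))
    return float(np.mean(np.abs(real_s - fake_s)))

def generator_objective(denoise_loss, real_batch, generated_batch, weights, lam=0.1):
    """Assumed form: the denoiser minimizes its usual loss plus the structure gap."""
    return denoise_loss + lam * structure_loss(real_batch, generated_batch, weights)

def discriminator_objective(real_batch, generated_batch, weights):
    """Assumed form: the discriminator maximizes the gap, so it minimizes its negative."""
    return -structure_loss(real_batch, generated_batch, weights)

# Toy usage with random data standing in for real and generated images.
rng = np.random.default_rng(0)
real = rng.normal(size=(4, 3, 8, 8))
fake = rng.normal(size=(4, 3, 8, 8))
W = rng.normal(size=(3 * 8 * 8, 16))
g_loss = generator_objective(1.0, real, fake, W, lam=0.5)
d_loss = discriminator_objective(real, fake, W)
```

In an actual training loop the two objectives would be optimized in alternation, with gradients flowing into the denoiser and the discriminator respectively. The sketch only shows how a batch-level structure term differs from a per-sample loss.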

Ling Yang, Haotian Qian, Zhilong Zhang, Jingwei Liu, Bin Cui • 2024

Related benchmarks

Task                                Dataset                 Result                      Rank
Class-conditional Image Generation  ImageNet 256x256        Inception Score (IS) 298.5  441
Image Generation                    CelebA 64 x 64 (test)   FID 1.16                    203
Unconditional Image Generation      CIFAR-10                FID 1.54                    171
Conditional Image Generation        CIFAR-10                FID 1.47                    71
Image Generation                    FFHQ 64x64 (test)       FID 1.71                    69
Image Generation                    Oxford Flowers          FID 18.18                   15
Image Generation                    CUB200                  FID 4.69                    10
Image Generation                    Food                    FID 5.74                    8
Image Generation                    SUN                     FID 7.35                    8
Image Generation                    DF-20M                  FID 15.12                   8
Showing 10 of 13 rows
