
Few-Shot Image Generation by Conditional Relaxing Diffusion Inversion

About

In Few-Shot Image Generation (FSIG) with Deep Generative Models (DGMs), accurately estimating the distribution of the target domain from only a few samples is a significant challenge. It requires a method that captures both the broad diversity and the true characteristics of the target domain distribution. We present Conditional Relaxing Diffusion Inversion (CRDI), a 'training-free' approach designed to enhance distribution diversity in synthetic image generation. Unlike conventional methods, CRDI does not rely on fine-tuning on only a few samples. Instead, it focuses on reconstructing each target image instance and expanding diversity through few-shot learning. The approach begins by identifying a Sample-wise Guidance Embedding (SGE) for the diffusion model, which serves a purpose analogous to the explicit latent codes in certain Generative Adversarial Network (GAN) models. A scheduler then progressively introduces perturbations to the SGE, thereby augmenting diversity. Comprehensive experiments demonstrate that our method surpasses GAN-based reconstruction techniques and matches state-of-the-art (SOTA) FSIG methods in performance. It also effectively mitigates overfitting and catastrophic forgetting, common drawbacks of fine-tuning approaches.
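
The idea of progressively perturbing a guidance embedding can be illustrated with a toy sketch. This is not the authors' scheduler: the linear schedule, the Gaussian noise, and the names `relaxing_schedule`, `perturb_embedding`, and `generate_variants` are all assumptions made for illustration, and the embedding is a plain list of floats rather than a real diffusion-model conditioning vector.

```python
import random


def relaxing_schedule(num_steps, max_scale):
    """Hypothetical schedule: noise scales rising linearly from 0 to max_scale."""
    if num_steps < 2:
        return [0.0] * num_steps
    return [max_scale * i / (num_steps - 1) for i in range(num_steps)]


def perturb_embedding(embedding, scale, rng):
    """Add zero-mean Gaussian noise with standard deviation `scale` to each coordinate."""
    return [x + scale * rng.gauss(0.0, 1.0) for x in embedding]


def generate_variants(embedding, num_steps=5, max_scale=0.5, seed=0):
    """Return one perturbed copy of the embedding per schedule step.

    The first copy uses scale 0 (a pure reconstruction of the input);
    later copies drift further from it, standing in for added diversity.
    """
    rng = random.Random(seed)
    return [
        perturb_embedding(embedding, scale, rng)
        for scale in relaxing_schedule(num_steps, max_scale)
    ]
```

In this sketch the zero-scale step returns the embedding unchanged, which loosely mirrors the reconstruction-first framing in the abstract; how CRDI actually finds the SGE and shapes its perturbations is not specified on this page.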

Yu Cao, Shaogang Gong • 2024

Related benchmarks

Task                        Dataset               Result      Rank
Few-shot Image Generation   Sunglasses 10-shot    FID 24.62   36
Few-shot Image Generation   Babies 10-shot        FID 48.52   35
Few-shot Image Generation   AFHQ-Dog 10-shot      FID 54.35   34
Few-shot Image Generation   MetFaces 10-shot      FID 51.28   34
Few-shot Image Generation   AFHQ-Wild 10-shot     FID 68.31   34
Few-shot Image Generation   AFHQ-Cat 10-shot      FID 65.3    34
Few-shot Image Generation   Sketches 10-shot      FID 36.59   18
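
For readers unfamiliar with the metric, the results are reported as FID (Fréchet Inception Distance), where lower is better. Assuming the standard definition (this page does not state the exact evaluation protocol), FID compares the mean and covariance of Inception features for real images, $(\mu_r, \Sigma_r)$, and generated images, $(\mu_g, \Sigma_g)$:

```latex
\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2
  + \mathrm{Tr}\!\left( \Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2} \right)
```

The first term measures how far apart the feature means are; the trace term measures the mismatch between the two covariances.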
