
FreeDoM: Training-Free Energy-Guided Conditional Diffusion Model

About

Recently, conditional diffusion models have gained popularity in numerous applications due to their exceptional generation ability. However, many existing methods are training-required. They need to train a time-dependent classifier or a condition-dependent score estimator, which increases the cost of constructing conditional diffusion models and is inconvenient to transfer across different conditions. Some current works aim to overcome this limitation by proposing training-free solutions, but most can only be applied to a specific category of tasks and not to more general conditions. In this work, we propose a training-Free conditional Diffusion Model (FreeDoM) used for various conditions. Specifically, we leverage off-the-shelf pre-trained networks, such as a face detection model, to construct time-independent energy functions, which guide the generation process without requiring training. Furthermore, because the construction of the energy function is very flexible and adaptable to various conditions, our proposed FreeDoM has a broader range of applications than existing training-free methods. FreeDoM is advantageous in its simplicity, effectiveness, and low cost. Experiments demonstrate that FreeDoM is effective for various conditions and suitable for diffusion models of diverse data domains, including image and latent code domains.
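The guidance idea in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a 1-D data distribution N(0, 1), for which the noise predictor is known in closed form and stands in for a pre-trained unconditional diffusion model. The energy function, the `target` condition, the `guidance_scale` value, and all function names are hypothetical choices made for illustration. The energy is evaluated on the estimated clean sample, so it does not depend on the timestep, and its gradient with respect to the noisy sample nudges each reverse step toward the condition. A finite difference stands in for the automatic differentiation a real implementation would likely use.

```python
import math
import random


def make_schedule(num_steps=100, beta_min=1e-4, beta_max=0.05):
    """Linear beta schedule and cumulative products of (1 - beta)."""
    betas = [beta_min + (beta_max - beta_min) * i / (num_steps - 1)
             for i in range(num_steps)]
    alpha_bars, prod = [], 1.0
    for beta in betas:
        prod *= 1.0 - beta
        alpha_bars.append(prod)
    return betas, alpha_bars


def predict_noise(x_t, alpha_bar):
    # Stand-in for a pre-trained unconditional noise predictor.
    # This closed form is exact only for the assumed toy data x0 ~ N(0, 1).
    return math.sqrt(1.0 - alpha_bar) * x_t


def estimate_clean(x_t, alpha_bar):
    # Estimate of the clean sample x0 given the noisy sample x_t.
    eps = predict_noise(x_t, alpha_bar)
    return (x_t - math.sqrt(1.0 - alpha_bar) * eps) / math.sqrt(alpha_bar)


def energy(x0_hat, target):
    # Hypothetical time-independent energy: low when x0_hat matches the condition.
    return (x0_hat - target) ** 2


def energy_grad(x_t, alpha_bar, target, h=1e-4):
    # Central finite difference of the energy with respect to x_t,
    # taken through the clean estimate.
    hi = energy(estimate_clean(x_t + h, alpha_bar), target)
    lo = energy(estimate_clean(x_t - h, alpha_bar), target)
    return (hi - lo) / (2.0 * h)


def sample(num_samples, guidance_scale, target, seed=0):
    """DDPM-style reverse sampling with an optional energy-guidance term."""
    rng = random.Random(seed)
    betas, alpha_bars = make_schedule()
    samples = []
    for _ in range(num_samples):
        x = rng.gauss(0.0, 1.0)  # start from pure noise
        for t in reversed(range(len(betas))):
            beta, alpha_bar = betas[t], alpha_bars[t]
            eps = predict_noise(x, alpha_bar)
            # Unconditional reverse-step mean.
            mean = (x - beta / math.sqrt(1.0 - alpha_bar) * eps) / math.sqrt(1.0 - beta)
            noise = rng.gauss(0.0, 1.0) if t > 0 else 0.0
            x_prev = mean + math.sqrt(beta) * noise
            # Energy guidance: move against the energy gradient at x_t.
            if guidance_scale > 0.0:
                x_prev -= guidance_scale * energy_grad(x, alpha_bar, target)
            x = x_prev
        samples.append(x)
    return samples


if __name__ == "__main__":
    guided = sample(200, guidance_scale=0.05, target=2.0)
    unguided = sample(200, guidance_scale=0.0, target=2.0)
    print("guided mean:  ", sum(guided) / len(guided))
    print("unguided mean:", sum(unguided) / len(unguided))
```

In this sketch the unguided samples stay centered near zero, while the guided samples shift toward the hypothetical target, without any retraining of the noise predictor.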

Jiwen Yu, Yinhuai Wang, Chen Zhao, Bernard Ghanem, Jian Zhang • 2023

Related benchmarks

Task | Dataset | Result | Rank
Class-conditional Image Generation | ImageNet | FID 200 | 132
Conditional Image Generation | CIFAR-10 | FID 135 | 71
Text-to-Image Generation | Pick-a-Pic (val) | PickScore 22.13 | 20
Text-to-Image Generation | Pick-a-Pic, HPSv2, and PartiPrompts (test) | PickScore 22.13 | 12
Text-to-Image Generation | Pick-a-Pic (500), HPSv2 (500), and PartiPrompts (1000) (test) | PickScore 21.96 | 10
Text-to-Image Synthesis | GenEval SD V1.5 | Overall Score 52 | 9
Conditional Image Generation | Fine-grained Birds | Accuracy 0.6 | 8
Conditional Image Generation | CelebA-HQ Gender+Age | Accuracy 68.7 | 7
Conditional Image Generation | CelebA-HQ Gender+Hair | Accuracy 67.1 | 7
Stylized Image Generation | SD prompts Stylized results 1.4 | Style Loss 10.21 | 4
Showing 10 of 15 rows
