
Classifier-Free Guidance inside the Attraction Basin May Cause Memorization

About

Diffusion models are prone to exactly reproducing images from their training data. This is concerning because it can lead to copyright infringement and/or leakage of privacy-sensitive information. In this paper, we present a novel perspective on the memorization phenomenon and propose a simple yet effective approach to mitigate it. We argue that memorization occurs because of an attraction basin in the denoising process that steers the diffusion trajectory towards a memorized image. This can be mitigated by guiding the diffusion trajectory away from the attraction basin: classifier-free guidance is withheld until an ideal transition point is reached, and applied only from that point on. This leads to the generation of non-memorized images that are high in image quality and well-aligned with the conditioning mechanism. To improve on this further, we present a new guidance technique, opposite guidance, that escapes the attraction basin sooner in the denoising process. We demonstrate the existence of attraction basins in various scenarios in which memorization occurs, and we show that our proposed approach successfully mitigates memorization.
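The delayed-guidance idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The combination rule is the standard classifier-free guidance formula. The function names, the `transition_step` argument, and the `pre_scale` parameter are hypothetical, as is the assumption that "not applying guidance" corresponds to a weight of 0 in this parameterization and that opposite guidance corresponds to a negative weight. How the transition point is detected is not described in the abstract and is not modeled here.

```python
def cfg_combine(eps_uncond, eps_cond, scale):
    """Standard classifier-free guidance: eps_u + scale * (eps_c - eps_u)."""
    return [u + scale * (c - u) for u, c in zip(eps_uncond, eps_cond)]


def guidance_scale_at(step, transition_step, scale=7.5, pre_scale=0.0):
    """Hypothetical schedule (an assumption, not taken from the paper).

    Before `transition_step` the guidance weight is `pre_scale`
    (0.0 = guidance withheld; a negative value would stand in for
    "opposite guidance"). From `transition_step` on, the usual
    `scale` is applied.
    """
    return pre_scale if step < transition_step else scale


# Toy noise predictions standing in for the denoiser outputs.
eps_uncond = [0.0, 0.0, 0.0, 0.0]
eps_cond = [1.0, 1.0, 1.0, 1.0]

# Six denoising steps with an assumed transition at step 3.
schedule = [guidance_scale_at(t, transition_step=3) for t in range(6)]
guided = [cfg_combine(eps_uncond, eps_cond, w) for w in schedule]
```

In this toy run the first three steps use the unguided prediction and the remaining steps use the full guidance weight. Passing a negative `pre_scale` would push the early steps away from the conditional prediction under the same assumed parameterization.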

Anubhav Jain, Yuya Kobayashi, Takashi Shibuya, Yuhta Takida, Nasir Memon, Julian Togelius, Yuki Mitsufuji • 2024

Related benchmarks

Task | Dataset | Metric | Result | Rank
Text-to-Image Generation | Pokémon | CLIP Score | 33.39 | 21
Text-to-Image Generation | LAION-Art | SSCD | 0.82 | 18
Text-to-Image Generation | CelebA-HQ | SSCD | 0.75 | 18
Memorization mitigation | Stable Diffusion 1.4 | Memorization Rate | 9.4 | 13
Text-to-Image Generation | Webster 500 Memorized Prompts 2023 v1.4 (430 with available target images) | SSCD (Target) | 0.1816 | 13
Mitigating memorization in conditional diffusion models | Scenario 3 duplicated prompts Stable Diffusion v1.4 | Similarity (95pc) | 0.6915 | 8
Text-to-Image Generation | Scenario 4 | Similarity (95th Percentile) | 0.8722 | 8
Memorization mitigation | Stable Diffusion finetuned 1.4 | Memorization Rate | 5.5 | 7
Memorization mitigation | SD 1.4 (val) | Inference Time (s) | 3.426 | 7
Text-to-Image Generation | LAION-10k Scenario 1 (test) | Similarity (95pc) | 0.3811 | 7
Showing 10 of 14 rows
