Self-Discovering Interpretable Diffusion Latent Directions for Responsible Text-to-Image Generation

About

Diffusion-based models have gained significant popularity for text-to-image generation due to their exceptional image-generation capabilities. A risk with these models is the potential generation of inappropriate content, such as biased or harmful images. However, the underlying reasons for generating such undesired content from the perspective of the diffusion model's internal representation remain unclear. Previous work interprets vectors in an interpretable latent space of diffusion models as semantic concepts. However, existing approaches cannot discover directions for arbitrary concepts, such as those related to inappropriate concepts. In this work, we propose a novel self-supervised approach to find interpretable latent directions for a given concept. With the discovered vectors, we further propose a simple approach to mitigate inappropriate generation. Extensive experiments have been conducted to verify the effectiveness of our mitigation approach, namely, for fair generation, safe generation, and responsible text-enhancing generation. Project page: https://interpretdiffusion.github.io

Hang Li, Chengzhi Shen, Philip Torr, Volker Tresp, Jindong Gu • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
Concept Unlearning | UnlearnDiffAtk | UnlearnDiffAtk | 0.697 | 36
Text-to-Image Generation | COCO 30k | FID | 15.98 | 29
Fair Generation | WinoBias Gender-Pro extended | Deviation Ratio | 0.07 | 20
Fair Generation | WinoBias Race (standard) | Deviation Ratio | 0.04 | 20
Fair Generation | WinoBias Gender (standard) | Deviation Ratio | 96 | 20
Fair Generation | WinoBias Race-Pro (extended) | Deviation Ratio | 0.08 | 20
Safe Text-to-Image Generation | MMA-Diffusion | Automatic Safety Rate | 90.7 | 20
Concept Unlearning | Ring-a-Bell | Ring-A-Bell Score | 69.6 | 20
Text-to-Image Generation | Non-targeted concepts | CLIP Score | 30.5 | 18
Concept Unlearning | I2P | I2P | 0.27 | 17
Showing 10 of 28 rows
