DiffusionCLIP: Text-Guided Diffusion Models for Robust Image Manipulation

About

Recently, GAN inversion methods combined with Contrastive Language-Image Pretraining (CLIP) have enabled zero-shot image manipulation guided by text prompts. However, applying them to diverse real images remains difficult because of the limited capability of GAN inversion. Specifically, these approaches often struggle to reconstruct images with novel poses, views, and highly variable content compared to the training data, and they may alter object identity or produce unwanted image artifacts. To mitigate these problems and enable faithful manipulation of real images, we propose a novel method, dubbed DiffusionCLIP, that performs text-driven image manipulation using diffusion models. Building on the full inversion capability and high-quality image generation power of recent diffusion models, our method performs zero-shot image manipulation successfully even between unseen domains and takes another step towards general application by manipulating images from the widely varying ImageNet dataset. Furthermore, we propose a novel noise combination method that allows straightforward multi-attribute manipulation. Extensive experiments and human evaluation confirm the robust and superior manipulation performance of our method compared to the existing baselines. Code is available at https://github.com/gwang-kim/DiffusionCLIP.git.

Gwanghyun Kim, Taesung Kwon, Jong Chul Ye • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Face image reconstruction | CelebA-HQ (test) | MAE 0.02 | 13 |
| Affective Image Filter | AIF | SSIM 53.05 | 11 |
| Text-driven Image Manipulation | CelebA-HQ (test) | Accuracy 1.3 | 10 |
| Sad Facial Attribute Editing | CelebA-HQ (test) | Sdir 0.163 | 8 |
| Smiling Facial Attribute Editing | CelebA-HQ (test) | Sdir 0.17 | 8 |
| Tanned Facial Attribute Editing | CelebA-HQ (test) | Sdir 0.174 | 8 |
| Affective Image Filtering | User Study (test) | EPS (%) 11.72 | 6 |
| Image Editing | LSUN-Church Ancient (test) | Sdir 0.1976 | 6 |
| Image Editing | LSUN-Church Red Brick (test) | Sdir 0.2085 | 6 |
| Image Editing | LSUN-Church Department Store (test) | Sdir 0.13 | 6 |
Showing 10 of 20 rows
