
Eye-for-an-eye: Appearance Transfer with Semantic Correspondence in Diffusion Models

About

As pre-trained text-to-image diffusion models have become a useful tool for image synthesis, users increasingly want to control the results in various ways. This paper tackles training-free appearance transfer, which produces an image that combines the structure of a target image with the appearance of a reference image. Existing methods usually fail to reflect semantic correspondence, because they rely on query-key similarity within the self-attention layer to establish correspondences between images. To address this, we propose explicitly rearranging the features according to dense semantic correspondences. Extensive experiments show the superiority of our method in various aspects: it preserves the structure of the target and reflects the correct colors from the reference, even when the two images are not aligned.
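The core idea in the abstract — explicitly rearranging features according to dense semantic correspondences instead of relying on attention's query-key similarity — can be illustrated with a minimal numpy sketch. This is not the paper's implementation; the function name and the nearest-neighbor matching rule are assumptions for illustration: each target location is matched to its most semantically similar reference location, and the reference's appearance feature at that match is gathered into the target's spatial layout.

```python
import numpy as np

def rearrange_by_correspondence(target_feats, ref_feats, ref_values):
    """Gather reference appearance features into the target's spatial layout.

    target_feats: (N, D) semantic features at N target locations
    ref_feats:    (M, D) semantic features at M reference locations
    ref_values:   (M, C) appearance features to transfer
    Returns (N, C): for each target location, the appearance feature of its
    nearest semantic match in the reference (illustrative sketch only).
    """
    # Cosine similarity between every target/reference feature pair
    t = target_feats / np.linalg.norm(target_feats, axis=1, keepdims=True)
    r = ref_feats / np.linalg.norm(ref_feats, axis=1, keepdims=True)
    sim = t @ r.T                # (N, M) similarity matrix
    match = sim.argmax(axis=1)   # dense correspondence: target index -> reference index
    return ref_values[match]     # explicitly rearranged appearance features

# Toy example: 4 target locations, 3 reference locations
rng = np.random.default_rng(0)
tgt = rng.normal(size=(4, 2))
ref = rng.normal(size=(3, 2))
vals = rng.normal(size=(3, 5))
out = rearrange_by_correspondence(tgt, ref, vals)
print(out.shape)  # (4, 5)
```

In an actual diffusion pipeline the inputs would be intermediate feature maps rather than random arrays, and the rearranged features would replace the appearance pathway during denoising; the hard argmax here is the simplest stand-in for a dense correspondence.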

Sooyeon Go, Kyungmook Choi, Minjung Shin, Youngjung Uh • 2024

Related benchmarks

Task                                 Dataset                            Result           Rank
Semantic-Aware Appearance Transfer   Curated dataset (100 image pairs)  CLIP-I 88.32     6
Overall Appearance Transfer Quality  Curated dataset (100 image pairs)  DeQA 3.6758      6
Material Transfer                    Curated dataset (100 image pairs)  CLIP-T 0.2428    6
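The CLIP-I metric in the benchmarks above is typically computed as the cosine similarity between CLIP image embeddings of the generated image and the reference (often reported multiplied by 100, which would match the 88.32 entry). A minimal numpy sketch of that computation, with random vectors standing in for real CLIP embeddings (loading an actual CLIP model is out of scope here):

```python
import numpy as np

def clip_i_score(emb_a, emb_b):
    """CLIP-I-style score: cosine similarity between two image embeddings."""
    a = np.asarray(emb_a, dtype=float)
    b = np.asarray(emb_b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Random 512-d vectors stand in for real CLIP image embeddings.
rng = np.random.default_rng(1)
e1 = rng.normal(size=512)
e2 = rng.normal(size=512)
score = clip_i_score(e1, e2)  # in [-1, 1]; identical embeddings give 1.0
```

CLIP-T is computed the same way, except one embedding comes from CLIP's text encoder applied to a prompt instead of from an image.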
