Stable Flow: Vital Layers for Training-Free Image Editing

About

Diffusion models have revolutionized the field of content synthesis and editing. Recent models have replaced the traditional UNet architecture with the Diffusion Transformer (DiT), and employed flow-matching for improved training and sampling. However, they exhibit limited generation diversity. In this work, we leverage this limitation to perform consistent image edits via selective injection of attention features. The main challenge is that, unlike the UNet-based models, DiT lacks a coarse-to-fine synthesis structure, making it unclear in which layers to perform the injection. Therefore, we propose an automatic method to identify "vital layers" within DiT, crucial for image formation, and demonstrate how these layers facilitate a range of controlled stable edits, from non-rigid modifications to object addition, using the same mechanism. Next, to enable real-image editing, we introduce an improved image inversion method for flow models. Finally, we evaluate our approach through qualitative and quantitative comparisons, along with a user study, and demonstrate its effectiveness across multiple applications. The project page is available at https://omriavrahami.com/stable-flow
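The abstract does not say how "vital layers" are identified, so the following is only a minimal sketch under a stated assumption: a layer counts as vital if bypassing it changes the output a lot. Toy vectors stand in for images, a plain L2 distance stands in for a perceptual similarity metric, and every name here (run_stack, layer_vitality, vital_layers) is hypothetical rather than taken from the authors' code.

```python
from typing import Callable, List, Sequence

Vector = List[float]
# Each toy layer returns a residual update that is added to its input.
Layer = Callable[[Vector], Vector]


def run_stack(layers: Sequence[Layer], x: Vector, skip: int = -1) -> Vector:
    """Run a residual stack, optionally bypassing the layer at index `skip`."""
    h = list(x)
    for i, layer in enumerate(layers):
        if i == skip:
            continue  # bypass: only the identity (residual) path is kept
        update = layer(h)
        h = [a + b for a, b in zip(h, update)]
    return h


def l2_distance(a: Vector, b: Vector) -> float:
    """Euclidean distance; an assumed stand-in for a perceptual metric."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5


def layer_vitality(layers: Sequence[Layer], x: Vector) -> List[float]:
    """Score each layer by how far the output moves when it is bypassed."""
    reference = run_stack(layers, x)
    return [
        l2_distance(reference, run_stack(layers, x, skip=i))
        for i in range(len(layers))
    ]


def vital_layers(layers: Sequence[Layer], x: Vector, top_k: int) -> List[int]:
    """Return the indices of the `top_k` highest-scoring layers, in order."""
    scores = layer_vitality(layers, x)
    ranked = sorted(range(len(layers)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:top_k])


# Toy stack: layers 1 and 3 make large updates, layers 0 and 2 barely matter.
toy_layers: List[Layer] = [
    lambda h: [0.01 * v for v in h],
    lambda h: [2.0 * v for v in h],
    lambda h: [0.02 * v for v in h],
    lambda h: [-0.9 * v for v in h],
]
toy_input: Vector = [1.0, -2.0]
selected = vital_layers(toy_layers, toy_input, top_k=2)
```

In this toy setup `selected` holds the two layers whose removal moves the output the most. In an editing pipeline of the kind the abstract describes, attention features would then be injected only in the selected layers, though how that injection is wired up depends on the model and is not shown here.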

Omri Avrahami, Or Patashnik, Ohad Fried, Egor Nemchinov, Kfir Aberman, Dani Lischinski, Daniel Cohen-Or • 2024

Related benchmarks

Task	Dataset	Metric	Result
Text-driven Image Editing	Dedicated evaluation dataset 88 concept pairs	CLIP Image Fidelity	83.24	7
Text-driven Image Editing	COCO-based (test)	CLIPtxt	0.23	6
Text-Guided Image Editing	Image Editing (test)	Text Following	83.33	6
Non-rigid image editing	Non-Rigid Editing Benchmark	GPT-4o Score	6.6417	6
Non-rigid image editing	PIE-Bench ChangePose	GPT-4o Score	4.8083	6
Text-driven Image Editing	COCO User Study	Prompt Adherence	82.33	5
text+structure to image generation	MoCA	NIQE	2.707	4
Image Editing	Image Editing Prompts (400 samples)	CLIP Similarity (Image)	96.42	2
