FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models

About

Editing real images using a pre-trained text-to-image (T2I) diffusion/flow model often involves inverting the image into its corresponding noise map. However, inversion by itself is typically insufficient for obtaining satisfactory results, and therefore many methods additionally intervene in the sampling process. Such methods achieve improved results but are not seamlessly transferable between model architectures. Here, we introduce FlowEdit, a text-based editing method for pre-trained T2I flow models, which is inversion-free, optimization-free and model agnostic. Our method constructs an ODE that directly maps between the source and target distributions (corresponding to the source and target text prompts) and achieves a lower transport cost than the inversion approach. This leads to state-of-the-art results, as we illustrate with Stable Diffusion 3 and FLUX. Code and examples are available on the project's webpage.
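To make the idea of an ODE that maps directly from the source to the target distribution more concrete, here is a minimal toy sketch. It is not the authors' implementation. It assumes a rectified-flow convention (t = 1 is pure noise, t = 0 is data), replaces the pre-trained T2I model with a closed-form velocity field for Gaussian "data" whose mean plays the role of the text prompt, and integrates the difference between the target-conditioned and source-conditioned velocities with Euler steps. The names `gaussian_velocity` and `edit_without_inversion`, the step count, and the noising scheme are all illustrative assumptions.

```python
import numpy as np


def gaussian_velocity(z, t, mu, s=1.0):
    """Hypothetical stand-in for a pre-trained flow model.

    Closed-form velocity E[X1 - X0 | Z_t = z] for data X0 ~ N(mu, s^2 I),
    noise X1 ~ N(0, I), and the interpolation Z_t = (1 - t) X0 + t X1.
    The mean `mu` plays the role of the text condition.
    """
    gain = (t - (1.0 - t) * s**2) / ((1.0 - t) ** 2 * s**2 + t**2)
    return -mu + gain * (z - (1.0 - t) * mu)


def edit_without_inversion(x_src, mu_src, mu_tar, n_steps=50, seed=0):
    """Move x_src toward the target distribution without inverting it to noise.

    Assumed scheme: at each time step, noise the source sample, build a
    matching target-side point, and take an Euler step along the difference
    of the two conditional velocities.
    """
    rng = np.random.default_rng(seed)
    ts = np.linspace(1.0, 0.0, n_steps + 1)  # integrate from noise level 1 to 0
    z_edit = x_src.copy()                    # the edit path starts at the source
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        noise = rng.standard_normal(x_src.shape)
        z_src = (1.0 - t_cur) * x_src + t_cur * noise  # noisy source sample
        z_tar = z_edit + z_src - x_src                 # same noise, edited content
        v_delta = (gaussian_velocity(z_tar, t_cur, mu_tar)
                   - gaussian_velocity(z_src, t_cur, mu_src))
        z_edit = z_edit + (t_next - t_cur) * v_delta   # Euler step
    return z_edit


x_src = np.array([0.5, -1.0])
mu_src = np.array([0.0, 0.0])
mu_tar = np.array([3.0, 1.0])
x_edit = edit_without_inversion(x_src, mu_src, mu_tar)
```

In this toy setting the two distributions differ only by a shift of their means, and the sketch moves `x_src` by `mu_tar - mu_src`. Because the stand-in velocity field is linear in `z`, the injected noise cancels in the velocity difference, so the result does not depend on the seed. A real T2I model is nonlinear, so neither property should be expected to carry over.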

Vladimir Kulikov, Matan Kleiner, Inbar Huberman-Spiegelglas, Tomer Michaeli • 2024

Related benchmarks

Task	Dataset	Result
Image Editing	PIE-Bench	PSNR 32.68	257
Image Editing	PIE-Bench (test)	PSNR 22.22	55
Text-Guided Image Editing	PIE-Bench	CLIP Similarity (Whole) 26.43	40
Image Editing	PIE-Bench	PSNR 22.17	25
Image Editing	PIE	Distance 12.73	18
Image Editing	EditEval v2	LPIPS 0.3921	14
Video Editing	71 Video Editing Tasks	Text Adherence Score 3.85	14
Image Editing	1024 x 1024 resolution	Runtime (4090, s) 101.1	14
Image Editing	SNR-Bench 1.0 (test)	Reward Model Structural Score 3.38	12
Image Editing	DIV2K	Distance 18.27	12

Showing 10 of 45 rows
