
Steering Rectified Flow Models in the Vector Field for Controlled Image Generation

About

Diffusion models (DMs) excel in photorealism, image editing, and solving inverse problems, aided by classifier-free guidance and image inversion techniques. However, rectified flow models (RFMs) remain underexplored for these tasks. Existing DM-based methods often require additional training, lack generalization to pretrained latent models, underperform, and demand significant computational resources due to extensive backpropagation through ODE solvers and inversion processes. In this work, we first develop a theoretical and empirical understanding of the vector field dynamics of RFMs in efficiently guiding the denoising trajectory. Our findings reveal that we can navigate the vector field in a deterministic and gradient-free manner. Utilizing this property, we propose FlowChef, which leverages the vector field to steer the denoising trajectory for controlled image generation tasks, facilitated by gradient skipping. FlowChef is a unified framework for controlled image generation that, for the first time, simultaneously addresses classifier guidance, linear inverse problems, and image editing without the need for extra training, inversion, or intensive backpropagation. Finally, we perform extensive evaluations and show that FlowChef significantly outperforms baselines in terms of performance, memory, and time requirements, achieving new state-of-the-art results. Project page: https://flowchef.github.io

Maitreya Patel, Song Wen, Dimitris N. Metaxas, Yezhou Yang • 2024

Related benchmarks

Task             | Dataset                      | Result               | Rank
-----------------|------------------------------|----------------------|-----
Inpainting       | FFHQ 1k                      | PSNR 32.07           | 14
Inpainting       | DIV2K 0.8k                   | PSNR 25.35           | 14
Video Editing    | VPBench (test)               | CLIP Score 26.17     | 13
Image Editing    | HumanEdit 1024px             | FID 32.3             | 12
Image Editing    | InpaintCOCO 512px            | FID 46.6             | 12
Image Inpainting | FFHQ DIV2K (val)             | Latency (s) 3        | 11
Inpainting       | DIV2K 768 x 768              | FID (Half Crop) 43.3 | 11
Inpainting       | FFHQ 768 x 768 5k samples    | FID (Half) 20.2      | 11
Image Inpainting | PIE-Bench (556 samples)      | FID 68.3             | 11
Super-Resolution | DIV2K 0.8k                   | PSNR 21.86           | 7
Showing 10 of 16 rows
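
For reading the PSNR rows above, the standard definition is PSNR = 10 · log10(MAX² / MSE), in decibels, where higher is better. The sketch below is only an illustration of that formula: the function name `psnr`, the flat-list input, and the 8-bit peak value of 255 are assumptions made here, and the page does not say how the listed scores were actually computed.

```python
import math


def psnr(reference, estimate, max_val=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    max_val is the assumed peak pixel value (255 for 8-bit images).
    """
    if len(reference) != len(estimate) or not reference:
        raise ValueError("inputs must be non-empty and the same length")
    # Mean squared error over all pixel values.
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    if mse == 0:
        # Identical inputs: PSNR is unbounded.
        return math.inf
    return 10.0 * math.log10(max_val ** 2 / mse)
```

Because the scale is logarithmic, halving the mean squared error raises the score by about 3 dB, which gives a rough sense of the gap between rows that share a dataset.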
