FreeInpaint: Tuning-free Prompt Alignment and Visual Rationality Enhancement in Image Inpainting

About

Text-guided image inpainting aims to generate new content within specified regions of an image from a user's textual prompt. The primary challenge is to align the inpainted areas accurately with the user-provided prompt while maintaining a high degree of visual fidelity. Existing inpainting methods produce visually convincing results by leveraging pre-trained text-to-image diffusion models, but they still struggle to uphold prompt alignment and visual rationality simultaneously. In this work, we introduce FreeInpaint, a plug-and-play, tuning-free approach that directly optimizes the diffusion latents on the fly during inference to improve the faithfulness of the generated images. Technically, we introduce a prior-guided noise optimization method that steers model attention towards valid inpainting regions by optimizing the initial noise. Furthermore, we design a composite guidance objective tailored specifically to the inpainting task. This objective efficiently directs the denoising process, enhancing prompt alignment and visual rationality by optimizing intermediate latents at each step. Through extensive experiments involving various inpainting diffusion models and evaluation metrics, we demonstrate the effectiveness and robustness of our proposed FreeInpaint.
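
To make the idea of steering attention towards the inpainting region by optimizing the initial noise more concrete, here is a minimal toy sketch. It is not the authors' implementation: the abstract does not specify the objective or the attention mechanism, so the sketch assumes a stand-in "attention" defined as a softmax over the latent values themselves, and a hypothetical loss equal to the negative log of the attention mass inside the mask. The names `optimize_initial_noise` and `masked_attention_mass`, the step count, and the learning rate are all illustrative assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - x.max())
    return e / e.sum()

def masked_attention_mass(z, mask):
    """Fraction of the toy attention (softmax of z) that falls inside the mask."""
    a = softmax(z.ravel())
    return float(a[mask.ravel()].sum())

def optimize_initial_noise(z0, mask, steps=50, lr=5.0):
    """Gradient descent on loss = -log(attention mass inside the mask).

    Assumption: attention is softmax(z); a real diffusion model would
    obtain attention maps from its cross- or self-attention layers.
    """
    z = z0.copy()
    m = mask.ravel().astype(float)
    for _ in range(steps):
        a = softmax(z.ravel())
        s = (a * m).sum()            # attention mass inside the mask
        grad = a - (a * m) / s       # analytic gradient of -log(s) w.r.t. z
        z -= lr * grad.reshape(z.shape)
    return z

rng = np.random.default_rng(0)
z0 = rng.standard_normal((8, 8))     # toy "initial noise" latent
mask = np.zeros((8, 8), dtype=bool)
mask[2:6, 2:6] = True                # toy inpainting region

z_opt = optimize_initial_noise(z0, mask)
before = masked_attention_mass(z0, mask)
after = masked_attention_mass(z_opt, mask)
print(f"attention mass in mask: before={before:.3f}, after={after:.3f}")
```

In this toy setting every update raises the latent values inside the mask and lowers those outside it, so the attention mass inside the mask increases. An actual method would also need to keep the optimized noise close to the Gaussian prior the diffusion model expects, which this sketch does not attempt.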

Chao Gong, Dong Li, Yingwei Pan, Jingjing Chen, Ting Yao, Tao Mei • 2025

Related benchmarks

Task | Dataset | Result | Rank
Image Inpainting | EditBench free-form masks (val) | ImageReward 0.5248 | 15
Text-guided image inpainting | MSCOCO with layout masks (test) | ImageReward 0.3422 | 15
