Aligned Stable Inpainting: Mitigating Unwanted Object Insertion and Preserving Color Consistency

About

Generative image inpainting can produce realistic, high-fidelity results even with large, irregular masks. However, existing methods still face key issues that make inpainted images look unnatural. In this paper, we identify two main problems: (1) Unwanted object insertion: generative models may hallucinate arbitrary objects in the masked region that do not match the surrounding context. (2) Color inconsistency: inpainted regions often exhibit noticeable color shifts, leading to smeared textures and degraded image quality. We analyze the underlying causes of these issues and propose efficient post-hoc solutions for pre-trained inpainting models. Specifically, we introduce Aligned Stable inpainting with UnKnown Areas prior (ASUKA), a principled framework. To reduce unwanted object insertion, we use reconstruction-based priors to guide the generative model, suppressing hallucinated objects while preserving generative flexibility. To address color inconsistency, we design a specialized VAE decoder that formulates latent-to-image decoding as a local harmonization task. This design significantly reduces color shifts and produces more color-consistent results. We implement ASUKA on two representative inpainting architectures: a U-Net-based model and a DiT-based model. For both, we analyze and propose lightweight injection strategies that minimize interference with the model's original generation capacity while still mitigating the two issues. We evaluate ASUKA on the Places2 dataset and on MISATO, our proposed diverse benchmark. Experiments show that ASUKA effectively suppresses object hallucination and improves color consistency, outperforming standard diffusion models, rectified flow models, and other inpainting methods. The dataset, models, and code will be released on GitHub.

Yikai Wang, Junqiu Yu, Chenjie Cao, Xiangyang Xue, Yanwei Fu • 2026
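
To make the color inconsistency problem described in the abstract more concrete, the sketch below shows one simple diagnostic. It is not ASUKA's decoder or any method from the paper. It assumes the model decodes a full image, so every unmasked pixel has both an original value and a decoded value, and a consistent offset between the two suggests a color shift that would show up at the mask boundary after compositing. The function names `composite` and `known_region_shift`, the nested-list image layout, and the mask convention (1 = region to inpaint) are all hypothetical choices made for illustration.

```python
def composite(original, generated, mask):
    """Paste generated pixels into the masked region (mask == 1) and
    keep the original pixels everywhere else (mask == 0)."""
    return [
        [gen if m else orig for orig, gen, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


def known_region_shift(original, generated, mask):
    """Mean per-channel difference (generated - original) over the
    unmasked pixels, where the ground truth is known."""
    sums = [0.0, 0.0, 0.0]
    count = 0
    for o_row, g_row, m_row in zip(original, generated, mask):
        for orig, gen, m in zip(o_row, g_row, m_row):
            if not m:  # only pixels outside the hole
                for c in range(3):
                    sums[c] += gen[c] - orig[c]
                count += 1
    if count == 0:  # the mask covers the whole image
        return (0.0, 0.0, 0.0)
    return tuple(s / count for s in sums)


if __name__ == "__main__":
    # Toy 2x2 RGB image: the decoded output is uniformly warmer than the input.
    original = [[(100, 100, 100), (100, 100, 100)],
                [(100, 100, 100), (100, 100, 100)]]
    generated = [[(110, 100, 95), (110, 100, 95)],
                 [(110, 100, 95), (110, 100, 95)]]
    mask = [[0, 1],
            [0, 0]]  # only the top-right pixel is inpainted

    print(known_region_shift(original, generated, mask))
    print(composite(original, generated, mask))
```

In the toy example the decoded image is offset by +10 in red and -5 in blue. After compositing, the single inpainted pixel carries that offset while its neighbors do not, which is the kind of visible seam the abstract attributes to color inconsistency.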

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Inpainting | FFHQ (test) | FID | 1.844 | 40 |
| Image Inpainting | MISATO @512 (test) | LPIPS | 0.139 | 17 |
| Inpainting | Places 2 (val) | LPIPS | 0.174 | 15 |
| Image Inpainting | User Study 40 random images (test) | UOM | 32.88 | 12 |
| Image Inpainting | MISATO User Study 1.0 (test) | UOM | 39.43 | 9 |
| Image Inpainting | CelebA-HQ (test) | LPIPS | 0.126 | 6 |
| Object Hallucination Evaluation | MISATO 512 resolution | VLM Judgment | 136 | 5 |
| Image Inpainting | MISATO@1K (test) | LPIPS | 0.156 | 4 |