KV-Edit: Training-Free Image Editing for Precise Background Preservation

About

Background consistency remains a significant challenge in image editing tasks. Despite extensive developments, existing works still face a trade-off between maintaining similarity to the original image and generating content that aligns with the target. Here, we propose KV-Edit, a training-free approach that uses the KV cache in DiTs to maintain background consistency: background tokens are preserved rather than regenerated, which eliminates the need for complex mechanisms or expensive training. The new content is generated within user-provided regions and integrates seamlessly with the background. We further explore the memory consumption of the KV cache during editing and optimize the space complexity to $O(1)$ using an inversion-free method. Our approach is compatible with any DiT-based generative model without additional training. Experiments demonstrate that KV-Edit significantly outperforms existing approaches in terms of both background preservation and image quality, even surpassing training-based methods. The project webpage is available at https://xilluill.github.io/projectpages/KV-Edit

Tianrui Zhu, Shiyi Zhang, Jiawei Shao, Yansong Tang • 2025
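
The idea of preserving background tokens through a KV cache can be illustrated with a small NumPy sketch. This is a toy, single-head, single-layer illustration and not the authors' implementation: the function names (`masked_kv_attention`, `edit_step`), the random projection matrices, and the choice to concatenate cached background keys and values with freshly computed foreground ones are all assumptions made for the example. It shows only that foreground tokens can attend to the background while the background tokens themselves are never recomputed.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def masked_kv_attention(q_fg, k_fg, v_fg, k_bg_cached, v_bg_cached):
    """Foreground queries attend over cached background K/V plus fresh foreground K/V.

    Assumption for this sketch: the cache is consumed by simple concatenation.
    """
    k = np.concatenate([k_bg_cached, k_fg], axis=0)
    v = np.concatenate([v_bg_cached, v_fg], axis=0)
    scores = q_fg @ k.T / np.sqrt(q_fg.shape[-1])
    return softmax(scores) @ v


def edit_step(tokens, fg_mask, k_bg_cached, v_bg_cached, w_q, w_k, w_v):
    """Update only the foreground tokens; background rows are copied unchanged."""
    fg = tokens[fg_mask]
    out = masked_kv_attention(fg @ w_q, fg @ w_k, fg @ w_v,
                              k_bg_cached, v_bg_cached)
    updated = tokens.copy()
    updated[fg_mask] = out
    return updated


# Hypothetical toy data: 6 tokens of width 4, tokens 2 and 3 form the edit region.
rng = np.random.default_rng(0)
dim = 4
tokens = rng.normal(size=(6, dim))
fg_mask = np.array([False, False, True, True, False, False])
w_q, w_k, w_v = (rng.normal(size=(dim, dim)) for _ in range(3))

# Build the cache once from the background tokens, then reuse it while editing.
bg = tokens[~fg_mask]
k_bg_cached, v_bg_cached = bg @ w_k, bg @ w_v

edited = edit_step(tokens, fg_mask, k_bg_cached, v_bg_cached, w_q, w_k, w_v)
print(np.array_equal(edited[~fg_mask], tokens[~fg_mask]))
```

In this sketch the background rows of `edited` are bit-identical to the input because they are never passed through the attention computation. A real DiT would keep such a cache per layer and per denoising step, which is the memory cost the abstract refers to.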

Related benchmarks

Task	Dataset	Result
Image Editing	PIE-Bench	PSNR 33.45	215
Image Semantic Editing	PIE-Bench (test)	PSNR 35.87	18
Image Editing	ReshapeBench	AS 6.51	10
Video Object Removal	DAVIS	TokSim 28.68	10
Video Object Removal	WIPER-Bench	TokSim 23.26	9
Image Editing	PIE-Bench random class	Quality Score 71.8	5
Multi-layer Image Editing	LayerEditBench Multi-layer	Edited Region HPS 21.75	3
Single-layer Image Editing	LayerEditBench Single-layer	Foreground HPS 15.57	3
