Does FLUX Already Know How to Perform Physically Plausible Image Composition?

About

Image composition aims to seamlessly insert a user-specified object into a new scene, but existing models struggle with complex lighting (e.g., accurate shadows, water reflections) and diverse, high-resolution inputs. Modern text-to-image diffusion models (e.g., SD3.5, FLUX) already encode essential physical and resolution priors, yet lack a framework to unleash them without resorting to latent inversion, which often locks object poses into contextually inappropriate orientations, or brittle attention surgery. We propose SHINE, a training-free framework for Seamless, High-fidelity Insertion with Neutralized Errors. SHINE introduces manifold-steered anchor loss, leveraging pretrained customization adapters (e.g., IP-Adapter) to guide latents for faithful subject representation while preserving background integrity. Degradation-suppression guidance and adaptive background blending are proposed to further eliminate low-quality outputs and visible seams. To address the lack of rigorous benchmarks, we introduce ComplexCompo, featuring diverse resolutions and challenging conditions such as low lighting, strong illumination, intricate shadows, and reflective surfaces. Experiments on ComplexCompo and DreamEditBench show state-of-the-art performance on standard metrics (e.g., DINOv2) and human-aligned scores (e.g., DreamSim, ImageReward, VisionReward). Code is available at https://github.com/ZhumingLian/SHINE.

Shilin Lu, Zhuming Lian, Zihan Zhou, Shaocong Zhang, Chen Zhao, Adams Wai-Kin Kong • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Compositional Image Generation | ComplexCompo 300 | CLIP-I | 0.7999 | 20 |
| Image Composition | DreamEditBench 220 | CLIP-I | 0.8125 | 14 |
| Image Composition | User Study | Average Ranking | 1.52 | 13 |
| Image Editing | DreamEdit-Bench 220 | HPSv3 | 8.8861 | 13 |
| Image Editing | Complex-Compo 300 | HPSv3 | 9.8418 | 13 |
| Image Composition | Resolution Benchmark 512 x 512 | Latency (s) | 18.08 | 13 |
| Compositional Image Generation | DreamEditBench 220 | CLIP-I | 0.8125 | 6 |
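
For readers unfamiliar with the CLIP-I rows, the metric is commonly computed as the mean cosine similarity between CLIP image embeddings of generated images and reference subject images. The sketch below illustrates only that arithmetic, on embeddings that are assumed to have been extracted already. The function names `cosine_similarity` and `clip_i_score` are hypothetical, and the exact evaluation protocol behind the numbers listed here (CLIP backbone, cropping, pairing of images) is not stated on this page and may differ.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two embedding vectors of equal length."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


def clip_i_score(
    generated: Sequence[Sequence[float]],
    references: Sequence[Sequence[float]],
) -> float:
    """Mean pairwise cosine similarity (assumed CLIP-I-style aggregation).

    `generated` and `references` are lists of precomputed image embeddings;
    how they are produced is outside the scope of this sketch.
    """
    scores = [cosine_similarity(g, r) for g in generated for r in references]
    if not scores:
        raise ValueError("need at least one generated and one reference embedding")
    return sum(scores) / len(scores)
```

In practice the embeddings would come from a CLIP image encoder, and a higher score indicates that the inserted subject looks more like the reference subject.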
