InstanceDiffusion: Instance-level Control for Image Generation

About

Text-to-image diffusion models produce high-quality images but do not offer control over individual instances in the image. We introduce InstanceDiffusion, which adds precise instance-level control to text-to-image diffusion models. InstanceDiffusion supports free-form language conditions per instance and allows flexible ways to specify instance locations, such as simple single points, scribbles, bounding boxes, or intricate instance segmentation masks, and combinations thereof. We propose three major changes to text-to-image models that enable precise instance-level control: our UniFusion block enables instance-level conditions for text-to-image models, the ScaleU block improves image fidelity, and our Multi-instance Sampler improves generations for multiple instances. InstanceDiffusion significantly surpasses specialized state-of-the-art models for each location condition. Notably, on the COCO dataset, we outperform the previous state of the art by 20.4% $\text{AP}_{50}^{\text{box}}$ for box inputs and 25.4% IoU for mask inputs.
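The headline numbers above are stated in terms of box AP at an IoU threshold of 0.5 and mask IoU. As a reminder of what those quantities measure, here is a minimal sketch of intersection over union for boxes and binary masks. The function names (`box_iou`, `mask_iou`, `matches_at_iou50`) and the `(x_min, y_min, x_max, y_max)` box layout are assumptions made for illustration; this is not the authors' evaluation pipeline, which the abstract does not describe.

```python
from typing import Sequence, Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def box_iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mask_iou(a: Sequence[Sequence[int]], b: Sequence[Sequence[int]]) -> float:
    """Intersection over union of two binary masks of the same shape."""
    inter = 0
    union = 0
    for row_a, row_b in zip(a, b):
        for pa, pb in zip(row_a, row_b):
            inter += 1 if (pa and pb) else 0
            union += 1 if (pa or pb) else 0
    return inter / union if union else 0.0


def matches_at_iou50(pred: Box, target: Box) -> bool:
    """True when a predicted box overlaps its target with IoU >= 0.5,
    the matching threshold that the '50' in AP50 refers to."""
    return box_iou(pred, target) >= 0.5
```

For example, `box_iou((0, 0, 2, 2), (1, 1, 3, 3))` gives 1/7, since the boxes share one unit of area out of seven covered in total, so that pair would not count as a match at the 0.5 threshold.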

Xudong Wang, Trevor Darrell, Sai Saketh Rambhatla, Rohit Girdhar, Ishan Misra • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text-to-Image Generation | T2I-CompBench | Shape Fidelity | 44.72 | 94 |
| Compositional Image Generation | COCO-MIG L2 | Instance Attribute Success Ratio | 68.24 | 14 |
| Compositional Image Generation | COCO-MIG L3 | Instance Attribute Success Ratio | 60.47 | 14 |
| Compositional Image Generation | COCO-MIG L4 | Instance Attribute Success Ratio | 59.88 | 14 |
| Compositional Image Generation | COCO-MIG L5 | Instance Attribute Success Ratio | 53.92 | 14 |
| Compositional Image Generation | COCO-MIG L6 | Instance Attribute Success Ratio | 0.5714 | 14 |
| Compositional Image Generation | COCO-MIG Avg | Instance Attribute Success Ratio | 58.49 | 14 |
| Multi-Instance Generation | DEIG-Bench | MAAhuman (C1) | 61 | 10 |
| Scribble-to-image generation | COCO (val) | FID | 26.4 | 10 |
| Instance-controlled Image Generation | InstDiff-Bench | AP | 40 | 9 |
Showing 10 of 31 rows
