
MIGC: Multi-Instance Generation Controller for Text-to-Image Synthesis

About

We present the Multi-Instance Generation (MIG) task: simultaneously generating multiple instances with diverse controls in one image. Given a set of predefined coordinates and their corresponding descriptions, the task is to ensure that the generated instances are accurately placed at the designated locations and that each instance's attributes adhere to its corresponding description. This broadens the scope of current research on single-instance generation, elevating it to a more versatile and practical dimension. Inspired by the idea of divide and conquer, we introduce an approach named Multi-Instance Generation Controller (MIGC) to address the challenges of the MIG task. First, we break down the MIG task into several subtasks, each involving the shading of a single instance. To ensure precise shading for each instance, we introduce an instance enhancement attention mechanism. Lastly, we aggregate all the shaded instances to provide the information needed to accurately generate multiple instances in Stable Diffusion (SD). To evaluate how well generation models perform on the MIG task, we provide a COCO-MIG benchmark along with an evaluation pipeline. Extensive experiments were conducted on the proposed COCO-MIG benchmark, as well as on various commonly used benchmarks. The evaluation results illustrate the exceptional control capabilities of our model in terms of quantity, position, attribute, and interaction. Code and demos will be released at https://migcproject.github.io/.
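To make the task input concrete, here is a minimal Python sketch of a MIG-style layout (one description and one target box per instance) together with a simple position check. The `Instance` type, the normalized `(x0, y0, x1, y1)` box format, the `position_success` helper, and the 0.5 IoU threshold are all assumptions made for illustration. They are not taken from the MIGC code or from the COCO-MIG evaluation pipeline, which may score position and attributes differently.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1), assumed normalized to [0, 1]


@dataclass(frozen=True)
class Instance:
    """One requested instance: a text description plus a target box (hypothetical type)."""
    description: str
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix0, iy0 = max(a[0], b[0]), max(a[1], b[1])
    ix1, iy1 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix1 - ix0) * max(0.0, iy1 - iy0)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def position_success(
    instances: List[Instance],
    detected: List[Optional[Box]],
    threshold: float = 0.5,  # assumed threshold, not from the paper
) -> List[bool]:
    """Flag each instance whose detected box overlaps its target box enough.

    detected[i] is the box found for instances[i], or None if nothing was found.
    """
    return [
        det is not None and iou(inst.box, det) >= threshold
        for inst, det in zip(instances, detected)
    ]


# A two-instance layout and made-up detections, purely for illustration.
layout = [
    Instance("a red apple", (0.0, 0.0, 0.5, 0.5)),
    Instance("a blue cup", (0.5, 0.5, 1.0, 1.0)),
]
detections = [(0.0, 0.0, 0.5, 0.4), None]
flags = position_success(layout, detections)
```

A check like this covers position only. Verifying that each instance also matches the attributes in its description would need a separate step, which this sketch does not attempt.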

Dewei Zhou, You Li, Fan Ma, Xiaoting Zhang, Yi Yang • 2024

Related benchmarks

Task | Dataset | Metric | Result | Rank
--- | --- | --- | --- | ---
Compositional Image Generation | COCO-MIG L3 | Instance Attribute Success Ratio | 63.1 | 14
Compositional Image Generation | COCO-MIG L4 | Instance Attribute Success Ratio | 61.27 | 14
Compositional Image Generation | COCO-MIG L5 | Instance Attribute Success Ratio | 57.25 | 14
Compositional Image Generation | COCO-MIG L6 | Instance Attribute Success Ratio | 0.5913 | 14
Compositional Image Generation | COCO-MIG Avg | Instance Attribute Success Ratio | 60.41 | 14
Compositional Image Generation | COCO-MIG L2 | Instance Attr Success Ratio | 66.37 | 14
Layout-to-Image Generation | COCO-Position 2014 | AP | 54.69 | 12
Multi-Instance Generation | DEIG-Bench | MAAhuman (C1) | 60 | 10
Controllable Image Generation | MIG-Bench | mIoU (L2) | 64.29 | 9
Layout-controllable Generation | COCO-MIG | SR | 27.75 | 9
Showing 10 of 22 rows
