UniCombine: Unified Multi-Conditional Combination with Diffusion Transformer
About
With the rapid development of diffusion models in image generation, the demand for more powerful and flexible controllable frameworks is increasing. Although existing methods can guide generation beyond text prompts, the challenge of effectively combining multiple conditional inputs while maintaining consistency with all of them remains unsolved. To address this, we introduce UniCombine, a DiT-based multi-conditional controllable generative framework capable of handling any combination of conditions, including but not limited to text prompts, spatial maps, and subject images. Specifically, we introduce a novel Conditional MMDiT Attention mechanism and incorporate a trainable LoRA module to build both the training-free and training-based versions. Additionally, we propose a new pipeline to construct SubjectSpatial200K, the first dataset designed for multi-conditional generative tasks covering both the subject-driven and spatially-aligned conditions. Extensive experimental results on multi-conditional generation demonstrate the outstanding universality and powerful capability of our approach with state-of-the-art performance.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Compositional Image Generation | ComplexCompo 300 | CLIP-I | 0.7361 | 20 |
| Image Composition | DreamEditBench 220 | CLIP-I | 0.8058 | 14 |
| Image Composition | User Study | Average Ranking | 2.94 | 13 |
| Image Editing | DreamEdit-Bench 220 | HPSv3 | 8.8415 | 13 |
| Image Editing | Complex-Compo 300 | HPSv3 | 8.8999 | 13 |
| Image Composition | Resolution Benchmark 512 x 512 | Latency (s) | 11.98 | 13 |
| Garment Generation | GarmentBench (test) | LLA | 0.635 | 8 |
| Multi-condition Image Generation (Multi-Spatial) | Multi-Spatial Evaluation Set | FID | 67.4 | 6 |
| Multi-condition Image Generation (Subject-Canny) | Subject-Canny (Evaluation Set) | FID | 61.03 | 4 |
| Multi-condition Image Generation (Subject-Depth) | Subject-Depth Evaluation Set | FID | 70.22 | 4 |