Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models
About
Public large-scale text-to-image diffusion models, such as Stable Diffusion, have gained significant attention from the community. These models can be easily customized for new concepts using low-rank adaptations (LoRAs). However, using multiple concept LoRAs to jointly support multiple customized concepts presents a challenge. We refer to this scenario as decentralized multi-concept customization, which involves single-client concept tuning and center-node concept fusion. In this paper, we propose a new framework called Mix-of-Show that addresses the challenges of decentralized multi-concept customization, including concept conflicts resulting from existing single-client LoRA tuning and identity loss during model fusion. Mix-of-Show adopts an embedding-decomposed LoRA (ED-LoRA) for single-client tuning and gradient fusion for the center node to preserve the in-domain essence of single concepts and support theoretically limitless concept fusion. Additionally, we introduce regionally controllable sampling, which extends spatially controllable sampling (e.g., ControlNet and T2I-Adapter) to address attribute binding and missing object problems in multi-concept sampling. Extensive experiments demonstrate that Mix-of-Show is capable of composing multiple customized concepts with high fidelity, including characters, objects, and scenes.
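For readers unfamiliar with the setting, the following is a minimal sketch of a standard LoRA update (a pretrained weight plus a low-rank product) and of a naive weighted merge of several concept LoRAs into one weight matrix. It is not the paper's ED-LoRA or gradient fusion, whose details are not given in this abstract; all function names, shapes, and values are hypothetical and chosen only for illustration.

```python
# Illustrative sketch only: plain LoRA update and naive multi-LoRA merging.
# Assumption: this is NOT Mix-of-Show's ED-LoRA or gradient fusion; names are hypothetical.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b, scale=1.0):
    """Return a + scale * b element-wise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def lora_delta(B, A):
    """Low-rank update delta_W = B @ A, with B of shape (d_out, r) and A of shape (r, d_in)."""
    return matmul(B, A)

def merge_loras(W0, loras, weights):
    """Naive fusion: W = W0 + sum_i weights[i] * (B_i @ A_i)."""
    W = [row[:] for row in W0]  # copy so the pretrained weight is left untouched
    for (B, A), w in zip(loras, weights):
        W = add(W, lora_delta(B, A), scale=w)
    return W

# Toy 2x2 pretrained weight and two rank-1 "concept" LoRAs.
W0 = [[1.0, 0.0], [0.0, 1.0]]
lora_a = ([[1.0], [0.0]], [[0.0, 2.0]])  # delta = [[0, 2], [0, 0]]
lora_b = ([[0.0], [1.0]], [[4.0, 0.0]])  # delta = [[0, 0], [4, 0]]

merged = merge_loras(W0, [lora_a, lora_b], weights=[0.5, 0.5])
print(merged)
```

In this toy merge each concept's update is scaled down to share one weight matrix, which gives some intuition for why a simple weighted combination could weaken individual concepts; the abstract presents gradient fusion as the alternative used at the center node.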
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multi-Concept Image Generation | 12-concept dataset | Text Alignment | 0.631 | 26 |
| Single-character story generation | User Study | C-A Score | 3.94 | 13 |
| Single-character story generation | Pororo | D-I | 60.08 | 13 |
| Single-character story generation | Frozen | D-I | 49.02 | 13 |
| Multi-character story visualization | Pororo Multi-character | D-I | 52.62 | 8 |
| Multi-character story visualization | Frozen Multi-character | D-I Score | 41.28 | 8 |
| Multi-character story visualization | User Study Multi-character | C-A Score | 3.04 | 8 |
| Personalized Image Generation | User Study 138 questions 1.0 (test) | Original Behavior Consistency | 8.5 | 8 |
| Concept Customization | DreamBenchCC Instance | CLIP-I Score (target) | 72 | 7 |
| Concept Customization | DreamBenchCC Style | CSD | 62 | 7 |