Mix-of-Show: Decentralized Low-Rank Adaptation for Multi-Concept Customization of Diffusion Models

About

Public large-scale text-to-image diffusion models, such as Stable Diffusion, have gained significant attention from the community. These models can be easily customized for new concepts using low-rank adaptations (LoRAs). However, the utilization of multiple concept LoRAs to jointly support multiple customized concepts presents a challenge. We refer to this scenario as decentralized multi-concept customization, which involves single-client concept tuning and center-node concept fusion. In this paper, we propose a new framework called Mix-of-Show that addresses the challenges of decentralized multi-concept customization, including concept conflicts resulting from existing single-client LoRA tuning and identity loss during model fusion. Mix-of-Show adopts an embedding-decomposed LoRA (ED-LoRA) for single-client tuning and gradient fusion for the center node to preserve the in-domain essence of single concepts and support theoretically limitless concept fusion. Additionally, we introduce regionally controllable sampling, which extends spatially controllable sampling (e.g., ControlNet and T2I-Adapter) to address attribute binding and missing object problems in multi-concept sampling. Extensive experiments demonstrate that Mix-of-Show is capable of composing multiple customized concepts with high fidelity, including characters, objects, and scenes.
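
To make the fusion problem in the abstract concrete, here is a minimal sketch of a low-rank update and of a naive weight-space merge of several concept LoRAs. It is an illustration under stated assumptions, not the Mix-of-Show implementation: the helper names (`lora_delta`, `naive_merge`), the toy dimensions, and the choice of simple averaging are all hypothetical, and averaging is a baseline, not the gradient fusion the paper proposes.

```python
import random

def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def add(a, b, scale=1.0):
    """Return a + scale * b, elementwise."""
    return [[x + scale * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def random_matrix(rows, cols, rng):
    """Small random matrix standing in for trained weights (toy values)."""
    return [[rng.gauss(0.0, 0.1) for _ in range(cols)] for _ in range(rows)]

def lora_delta(B, A):
    """Low-rank update B @ A, with B of shape (d_out, r) and A of shape (r, d_in)."""
    return matmul(B, A)

def naive_merge(W0, deltas):
    """Average the concept updates in weight space (assumed baseline, not gradient fusion)."""
    merged = W0
    for delta in deltas:
        merged = add(merged, delta, scale=1.0 / len(deltas))
    return merged

rng = random.Random(0)
d_out, d_in, rank = 4, 6, 2  # hypothetical toy sizes

W0 = random_matrix(d_out, d_in, rng)  # pretrained layer weight
concept_deltas = [
    lora_delta(random_matrix(d_out, rank, rng), random_matrix(rank, d_in, rng))
    for _ in range(3)  # three independently tuned concepts
]

W_single = add(W0, concept_deltas[0])      # one concept applied at full strength
W_fused = naive_merge(W0, concept_deltas)  # each concept scaled by 1/3
```

In this sketch each concept's update is scaled by 1/n in the merged weight, which gives one plausible intuition for why a simple merge can weaken individual concepts as more are added.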

Yuchao Gu, Xintao Wang, Jay Zhangjie Wu, Yujun Shi, Yunpeng Chen, Zihan Fan, Wuyou Xiao, Rui Zhao, Shuning Chang, Weijia Wu, Yixiao Ge, Ying Shan, Mike Zheng Shou • 2023

Related benchmarks

Task	Dataset	Metric	Result	
Multi-Concept Image Generation	12-concept dataset	Text Alignment	0.631	26
Text-to-Image Personalization	Concepts dataset	CLIP-I Score	0.677	14
Single-character story generation	User Study	C-A Score	3.94	13
Single-character story generation	Pororo	D-I	60.08	13
Single-character story generation	Frozen	D-I	49.02	13
Multi-subject Text-to-Image Generation	Orthogonal Adaptation concept bank	Text Alignment Score (Single)	62.5	10
Multi-character story visualization	Pororo Multi-character	D-I	52.62	8
Multi-character story visualization	Frozen Multi-character	D-I Score	41.28	8
Multi-character story visualization	User Study Multi-character	C-A Score	3.04	8
Personalized Image Generation	User Study 138 questions 1.0 (test)	Original Behavior Consistency	8.5	8

Showing 10 of 20 rows
