
LoRACLR: Contrastive Adaptation for Customization of Diffusion Models

About

Recent advances in text-to-image customization have enabled high-fidelity, context-rich generation of personalized images, allowing specific concepts to appear in a variety of scenarios. However, current methods struggle with combining multiple personalized models, often leading to attribute entanglement or requiring separate training to preserve concept distinctiveness. We present LoRACLR, a novel approach for multi-concept image generation that merges multiple LoRA models, each fine-tuned for a distinct concept, into a single, unified model without additional individual fine-tuning. LoRACLR uses a contrastive objective to align and merge the weight spaces of these models, ensuring compatibility while minimizing interference. By enforcing distinct yet cohesive representations for each concept, LoRACLR enables efficient, scalable model composition for high-quality, multi-concept image synthesis. Our results highlight the effectiveness of LoRACLR in accurately merging multiple concepts, advancing the capabilities of personalized image generation.
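The abstract does not spell out the objective, so the following is only a minimal sketch of what merging LoRA updates under a contrastive loss could look like. It assumes a generic margin-based contrastive form (pull the merged model's output toward the output of the matching concept's LoRA, push it away from another concept's output). The function names, the margin value, and the loss form are illustrative assumptions, not the authors' implementation.

```python
import math
from typing import List

Matrix = List[List[float]]
Vector = List[float]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain-Python matrix product, enough for a toy example."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def lora_delta(B: Matrix, A: Matrix) -> Matrix:
    """Standard LoRA low-rank update: delta_W = B @ A."""
    return matmul(B, A)


def distance(u: Vector, v: Vector) -> float:
    """Euclidean distance between two output vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(u, v)))


def contrastive_loss(merged_out: Vector, target_out: Vector,
                     other_out: Vector, margin: float = 1.0) -> float:
    """Assumed generic margin-based contrastive loss (hypothetical form).

    Positive term: squared distance between the merged model's output and
    the output of the LoRA trained for the same concept.
    Negative term: penalty when the merged output lies within `margin`
    of a different concept's output.
    """
    d_pos = distance(merged_out, target_out)
    d_neg = distance(merged_out, other_out)
    return d_pos ** 2 + max(0.0, margin - d_neg) ** 2


if __name__ == "__main__":
    # Rank-1 LoRA update for a 2x2 weight matrix.
    print(lora_delta([[1.0], [2.0]], [[3.0, 4.0]]))
    # The merged output matches its own concept and is far from the other one.
    print(contrastive_loss([1.0, 0.0], [1.0, 0.0], [4.0, 4.0]))
```

In a sketch like this, the loss would be minimized over the merged update so that each concept's behavior is preserved while outputs for different concepts stay separated.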

Enis Simsar, Thomas Hofmann, Federico Tombari, Pinar Yanardag • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Multi-Concept Image Generation | 12-concept dataset | Text Alignment | 0.668 | 26 |
| Text-to-Image Personalization | Concepts dataset | CLIP-I Score | 0.674 | 14 |
| Multi-concept Generation | 32 concepts | DINO | 0.434 | 5 |
| Multi-Concept Image Generation | User Study | Identity Alignment | 3.42 | 4 |
| Multi-Concept Image Generation | Multi-concept generation evaluation set | Accuracy (Avg) | 72.4 | 4 |
