LCM-LoRA: A Universal Stable-Diffusion Acceleration Module

About

Latent Consistency Models (LCMs) have achieved impressive performance in accelerating text-to-image generative tasks, producing high-quality images with minimal inference steps. LCMs are distilled from pre-trained latent diffusion models (LDMs), requiring only ~32 A100 GPU training hours. This report further extends LCMs' potential in two aspects. First, by applying LoRA distillation to Stable-Diffusion models including SD-V1.5, SSD-1B, and SDXL, we have expanded LCMs' scope to larger models with significantly less memory consumption, achieving superior image generation quality. Second, we identify the LoRA parameters obtained through LCM distillation as a universal Stable-Diffusion acceleration module, named LCM-LoRA. LCM-LoRA can be directly plugged into various Stable-Diffusion fine-tuned models or LoRAs without training, thus representing a universally applicable accelerator for diverse image generation tasks. Compared with previous numerical PF-ODE solvers such as DDIM and DPM-Solver, LCM-LoRA can be viewed as a plug-in neural PF-ODE solver that possesses strong generalization abilities. Project page: https://github.com/luosiallen/latent-consistency-model.
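
The "plug in without training" property follows from how a LoRA update is applied: a low-rank product is added to a frozen weight matrix, so the same update can be added to any model that shares that layer shape, alone or alongside another LoRA. The following is a minimal sketch in pure Python on toy 2x2 matrices. The helper names `lora_delta` and `plug_in`, the `alpha / rank` scaling convention, and all numeric values are illustrative assumptions, not the authors' code or released weights.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product for small list-of-lists matrices."""
    inner = len(b)
    assert all(len(row) == inner for row in a), "shape mismatch"
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for row in a
    ]


def lora_delta(up: Matrix, down: Matrix, alpha: float) -> Matrix:
    """Low-rank weight update (alpha / rank) * up @ down (assumed convention)."""
    rank = len(down)  # number of rows in the down-projection
    scale = alpha / rank
    return [[scale * v for v in row] for row in matmul(up, down)]


def plug_in(weight: Matrix, updates: Sequence[Tuple[Matrix, float]]) -> Matrix:
    """Add each (delta, strength) pair to a copy of the frozen weight."""
    merged = [row[:] for row in weight]
    for delta, strength in updates:
        for i, row in enumerate(delta):
            for j, v in enumerate(row):
                merged[i][j] += strength * v
    return merged


# Toy stand-ins: one frozen layer, an "acceleration" LoRA and a "style" LoRA.
base_weight = [[1.0, 0.0], [0.0, 1.0]]
accel = lora_delta(up=[[1.0], [2.0]], down=[[0.5, -1.0]], alpha=1.0)
style = lora_delta(up=[[0.0], [1.0]], down=[[2.0, 0.0]], alpha=1.0)

# The acceleration update alone, then combined with the style update.
accelerated = plug_in(base_weight, [(accel, 1.0)])
accelerated_styled = plug_in(base_weight, [(accel, 1.0), (style, 1.0)])
```

Because merging is a plain sum, no gradient step is involved at plug-in time; the per-update `strength` factor is one hypothetical way to weight the two updates against each other.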

Simian Luo, Yiqin Tan, Suraj Patil, Daniel Gu, Patrick von Platen, Apolinário Passos, Longbo Huang, Jian Li, Hang Zhao • 2023

Related benchmarks

Task | Dataset | Result | Rank
Text-to-Image Generation | MS-COCO 2014 (val) | -- | 128
Text-to-Image Synthesis | MSCOCO | FID 23.62 | 31
Text-to-Image Generation | Stable Diffusion v1.5 | FID (5k) 15.63 | 27
Text-to-Image Generation | SD 1.5 | FID 35.48 | 9
Text-to-image upsampling | reLAION-400M 5k (test) | Latency (s/img) 0.72 | 7
Image Generation | LAION-5B | FID 13.31 | 6
Image Generation | COCO 2014 | FID 19.74 | 6
Image Generation | SDXL | FID 9.42 | 6
Text-to-Image Generation | LAION-5B | FID 15.28 | 6
Text-to-Image Generation | COCO 2014 | FID 23.49 | 6