SenseFlow: Scaling Distribution Matching for Flow-based Text-to-Image Distillation

About

Distribution Matching Distillation (DMD) has been successfully applied to text-to-image diffusion models such as Stable Diffusion (SD) 1.5. However, vanilla DMD suffers from convergence difficulties on large-scale flow-based text-to-image models, such as SD 3.5 and FLUX. In this paper, we first analyze the issues that arise when applying vanilla DMD to large-scale models. Then, to overcome the scalability challenge, we propose implicit distribution alignment (IDA) to regularize the distance between the generator and the fake distribution. Furthermore, we propose intra-segment guidance (ISG) to relocate the timestep importance distribution from the teacher model. With IDA alone, DMD converges for SD 3.5; employing both IDA and ISG, DMD converges for SD 3.5 and FLUX.1 dev. Along with other improvements such as scaled-up discriminator models, our final model, dubbed SenseFlow, achieves superior performance in distillation for both diffusion-based text-to-image models such as SDXL and flow-matching models such as SD 3.5 Large and FLUX. The source code will be available at https://github.com/XingtongGe/SenseFlow.
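For readers new to distribution matching, the following toy sketch may help illustrate the kind of update that vanilla DMD is built on: the generator is moved along the difference between a "fake" score (of the generator's own output distribution) and a "real" score (from the teacher). It is not the SenseFlow method and does not implement IDA or ISG. Everything in it is an assumption made for illustration: the 1D Gaussian setting, the analytic scores standing in for the two score networks, and the hypothetical names `gaussian_score`, `dmd_toy_step`, and `run_toy`.

```python
import random


def gaussian_score(x, mean, std=1.0):
    """Score (gradient of the log-density) of N(mean, std^2) evaluated at x."""
    return -(x - mean) / (std ** 2)


def dmd_toy_step(theta, real_mean, lr=0.1, n_samples=256, rng=None):
    """One distribution-matching update for the toy generator x = theta + z.

    Assumption: both scores are known in closed form. In practice they would
    come from a learned fake-score model and a pretrained teacher.
    """
    rng = rng or random.Random(0)
    grad = 0.0
    for _ in range(n_samples):
        x = theta + rng.gauss(0.0, 1.0)         # generator sample; dx/dtheta = 1
        s_fake = gaussian_score(x, theta)       # score of the generator's distribution
        s_real = gaussian_score(x, real_mean)   # score of the target distribution
        grad += (s_fake - s_real) * 1.0         # chain rule through dx/dtheta
    grad /= n_samples
    return theta - lr * grad                    # gradient descent on KL(fake || real)


def run_toy(theta0=-3.0, real_mean=2.0, steps=200):
    """Run the toy update repeatedly and return the final generator mean."""
    rng = random.Random(0)
    theta = theta0
    for _ in range(steps):
        theta = dmd_toy_step(theta, real_mean, rng=rng)
    return theta


if __name__ == "__main__":
    print(run_toy())  # the generator mean should approach real_mean
```

Because this toy uses the exact fake score, the fake distribution never drifts from the generator, so it cannot reproduce the convergence difficulties reported for large-scale models. It only shows the direction of the basic update.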

Xingtong Ge, Xin Zhang, Tongda Xu, Yi Zhang, Xinjie Zhang, Yan Wang, Jun Zhang • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Text-to-Image Generation | GenEval | GenEval Score 60 | 277 |
| Text-to-Image Generation | DPG-Bench | DPG Score 79.86 | 89 |
| Text-to-Image Generation | OneIG-Bench | Alignment 0.776 | 33 |
| Text-to-Image Generation | MS-COCO 10K prompts 2014 (val) | FID 34.1 | 19 |
| Text-to-Image Generation | HPS prompt set v2 | CLIP Score 0.283 | 11 |
| Text-to-Image Generation | Align5000 1.0 (test) | CLIP Score 0.311 | 9 |
