
The Determinism of Randomness: Latent Space Degeneracy in Diffusion Models

About

Diffusion models draw the initial latent from an isotropic Gaussian distribution (all directions equally likely). But in practice, changing only the random seed can sharply alter image quality and prompt faithfulness. We explain this by distinguishing the isotropic prior from the semantics induced by the sampling map: while the prior is direction-agnostic, the mapping from latent noise to semantics has semantic-invariant directions and semantic-sensitive directions, so different seeds can lead to very different semantic outcomes. Motivated by this view, we propose a training-free inference procedure that (i) suppresses seed-specific, semantic-irrelevant variation via distribution-preserving semantic erasure, (ii) reinforces prompt-relevant semantic directions through timestep-aggregated horizontal injection, and (iii) applies a simple spherical retraction to stay near the prior's typical set. Across multiple backbones and benchmarks, our method consistently improves alignment and generation quality over standard sampling.
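The abstract does not spell out the retraction step, but "staying near the prior's typical set" has a standard reading: a d-dimensional standard Gaussian concentrates its mass on the sphere of radius sqrt(d), so a perturbed latent can be rescaled back to that radius. A minimal NumPy sketch under that assumption (function name, latent shape, and the rescaling rule are ours, not from the paper):

```python
import numpy as np

def spherical_retraction(z: np.ndarray) -> np.ndarray:
    """Rescale a latent so its norm matches sqrt(d), the typical norm of a
    d-dimensional standard Gaussian. Hypothetical reading of the paper's
    'simple spherical retraction'; the exact operation may differ."""
    d = z.size
    return z * (np.sqrt(d) / np.linalg.norm(z))

# Example: an SD-style latent perturbed by some semantic edit
rng = np.random.default_rng(0)
z = 1.3 * rng.standard_normal((4, 64, 64))  # norm drifted off the typical set
z_ret = spherical_retraction(z)
print(np.linalg.norm(z_ret), np.sqrt(z.size))  # the two values now agree
```

Any edit that moves the latent off this sphere (e.g. adding a semantic direction) pushes it into a low-density region of the prior; the retraction restores the norm while preserving the edit's direction.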

Song Yan, Chenfeng Wang, Wei Zhai, Xinliang Bi, Jian Yang, Yusen Zhang, Yunwei Lan, Tao Zhang, GuanYe Xiong, Min Li, Zheng-Jun Zha • 2025

Related benchmarks

Task                       Dataset                  Result                           Rank
Video Generation           VBench                   –                                102
Text-to-Image Generation   Pick-a-Pic               PickScore 17.5612                47
Image-to-3D                Toys4k                   FD (Inception) 29.4028           8
Text-to-Image Generation   DrawBench                PickScore 17.597                 7
Text-to-Image Generation   HPD                      PickScore 16.8347                7
Text-to-3D Generation      User Study Text to 3D    Detailed Objects Score 58.6      2
Text-to-Image Generation   User Study T2I           Basic Objects & Colors 92.5      2
Text-to-Video Generation   User Study T2V           Dynamic Scenes Score 0.664       2
