
Latent Diffusion Inversion Requires Understanding the Latent Space

About

The recovery of training data from generative models ("model inversion") has been extensively studied for diffusion models in the data domain as a memorization/overfitting phenomenon. Latent diffusion models (LDMs), which operate on the latent codes of encoder/decoder pairs, have proven robust to prior inversion methods. In this work we describe two key findings: (1) the diffusion model exhibits non-uniform memorization across latent codes, tending to overfit samples located in high-distortion regions of the decoder pullback metric; (2) even within a single latent code, memorization contributions are unequal across representation dimensions. Our proposed method ranks latent dimensions by their contribution to the decoder pullback metric, which in turn identifies the dimensions that drive memorization. For score-based membership inference, a sub-task of model inversion, we find that removing less-memorizing dimensions improves performance for all tested methods and datasets, with average AUROC gains of 1-4% and substantial increases in TPR@1%FPR (1-32%) across diverse datasets including CIFAR-10, CelebA, ImageNet-1K, Pokemon, MS-COCO, and Flickr. Our results highlight the overlooked influence of auto-encoder geometry on LDM memorization and provide a new perspective for analyzing privacy risks in diffusion-based generative models.
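The ranking step described above can be sketched in code. The pullback metric of a decoder D at latent z is G(z) = J(z)^T J(z), where J is the decoder Jacobian, and the per-dimension contribution is the diagonal entry ||∂D/∂z_i||^2. The toy MLP decoder below is a hypothetical stand-in for an LDM auto-encoder's decoder; the function names and network shapes are illustrative assumptions, not the paper's implementation.

```python
import torch

torch.manual_seed(0)

# Toy decoder standing in for an LDM auto-encoder decoder (assumption).
latent_dim, data_dim = 8, 32
decoder = torch.nn.Sequential(
    torch.nn.Linear(latent_dim, 64),
    torch.nn.Tanh(),
    torch.nn.Linear(64, data_dim),
)

def pullback_diagonal(z):
    """Diagonal of the pullback metric G(z) = J^T J: one ||dD/dz_i||^2 per latent dim."""
    J = torch.autograd.functional.jacobian(decoder, z)  # shape: (data_dim, latent_dim)
    return (J ** 2).sum(dim=0)

z = torch.randn(latent_dim)
contrib = pullback_diagonal(z)

# Latent dimensions sorted from highest to lowest pullback contribution;
# the high-contribution ("memorizing") dimensions would be kept when
# computing a membership-inference score, per the finding above.
ranking = torch.argsort(contrib, descending=True)
print(ranking.tolist())
```

In a real attack, the low-ranked dimensions would be dropped before computing the score used by an existing score-based membership-inference method.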

Mingxing Rao, Bowen Qu, Daniel Moyer • 2025

Related benchmarks

Task                         Dataset          Result      Rank
Membership Inference Attack  CIFAR-10         AUC 91.26   107
Membership Inference Attack  Flickr (test)    AUC 74.16   21
Membership Inference Attack  CelebA           AUC 88.18   9
Membership Inference Attack  ImageNet         AUC 72.55   9
Membership Inference Attack  Pokemon (test)   AUC 96.23   9
Membership Inference Attack  MS-COCO (test)   AUC 96.86   9
