Score-based Membership Inference on Diffusion Models
About
Membership inference attacks (MIAs) against Diffusion Models (DMs) raise pressing privacy concerns by revealing whether a sample was part of the training set. While existing methods typically rely on measuring reconstruction error across multiple denoising steps as a test statistic, they often incur significant computational overhead. In this work, we present a simple yet successful attack statistic using only the predicted noise vectors from the DM's denoiser, or equivalently, the score. Specifically, we show that the expected denoiser output points toward a kernel-weighted local mean of nearby training samples, such that its norm encodes proximity to the training set and thereby reveals membership. Building on this observation, we propose SimA, a single-query attack that provides a principled, efficient alternative to existing multi-query methods. SimA consistently achieves superior performance across DM variants and Latent Diffusion Models (LDMs) on eight different datasets. Its Monte Carlo variant (SimA-MC) exhibits state-of-the-art performance across all experiments, significantly outperforming baseline methods in terms of TPR@1%FPR. These results demonstrate that complex reconstruction trajectories are unnecessary for effective membership inference, establishing SimA as a highly efficient benchmark for auditing privacy in DMs and LDMs.
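The core observation, that the norm of the predicted noise encodes proximity to the training set, can be illustrated with a toy sketch. This is not the authors' implementation. It assumes a simplified additive-noise model (x_t = x_0 + sigma * eps) and replaces the trained denoiser with the ideal noise prediction for a Gaussian-smoothed empirical distribution, which has a closed form in terms of a kernel-weighted local mean. The names `toy_eps_prediction` and `score_norm_statistic`, the value of `sigma`, and the synthetic data are all hypothetical.

```python
import numpy as np

def toy_eps_prediction(x, train, sigma):
    """Ideal noise prediction at x for the training set smoothed by N(0, sigma^2 I).

    Hypothetical stand-in for a trained denoiser: the prediction is
    (x - local_mean) / sigma, where local_mean is a Gaussian-kernel-weighted
    average of the training samples.
    """
    sq_dist = np.sum((train - x) ** 2, axis=1)
    logits = -sq_dist / (2.0 * sigma ** 2)
    weights = np.exp(logits - logits.max())  # numerically stable softmax
    weights /= weights.sum()
    local_mean = weights @ train
    return (x - local_mean) / sigma

def score_norm_statistic(x, train, sigma):
    """Single-query statistic: norm of the predicted noise (small suggests member)."""
    return float(np.linalg.norm(toy_eps_prediction(x, train, sigma)))

# Synthetic data (assumption): members and held-out points from the same distribution.
rng = np.random.default_rng(0)
train = rng.normal(size=(200, 16))
held_out = rng.normal(size=(200, 16))
sigma = 0.5

member_scores = np.array([score_norm_statistic(x, train, sigma) for x in train])
nonmember_scores = np.array([score_norm_statistic(x, train, sigma) for x in held_out])
```

In this toy setting a training sample sits on top of its own kernel-weighted local mean, so its statistic is close to zero, while a held-out sample is pulled toward its nearest training neighbors and gets a larger norm. A real attack would query the trained denoiser of the target model instead of the closed form used here.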
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Membership Inference Attack | CIFAR-10 | AUC | 92.16 | 120 |
| Membership Inference Attack | CIFAR-100 | TPR @ 1% FPR | 44.66 | 46 |
| Membership Inference Attack | CelebA | AUC | 95.04 | 22 |
| Membership Inference Attack | ImageNet | AUC | 71.13 | 15 |
| Membership Inference Attack | STL10 U | ASR | 80.55 | 13 |
| Membership Inference Attack | Pokémon lambdalabs blip-captions (fine-tuned) | AUC | 97.01 | 7 |
| Membership Inference Attack | MS-COCO fine-tuned 2017 (val) | AUC | 94.24 | 7 |
| Membership Inference Attack | Flickr30K (fine-tuned) | AUC | 72.23 | 7 |
| Membership Inference Attack | ImageNet 1K V2 (train) | ASR | 85.73 | 7 |
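
Two of the metrics in the table, AUC and TPR at a fixed FPR, can be computed from per-sample attack scores roughly as follows. This is a minimal sketch, not the evaluation code behind the reported numbers. It assumes the convention that a higher score means "more likely a member" (a norm-based statistic would be negated first), and the function names and example scores are hypothetical.

```python
import numpy as np

def auc(member_scores, nonmember_scores):
    """Probability that a random member outscores a random non-member (ties count half)."""
    m = np.asarray(member_scores, dtype=float)[:, None]
    n = np.asarray(nonmember_scores, dtype=float)[None, :]
    return float(np.mean((m > n) + 0.5 * (m == n)))

def tpr_at_fpr(member_scores, nonmember_scores, max_fpr=0.01):
    """True-positive rate at the loosest threshold whose FPR does not exceed max_fpr."""
    m = np.asarray(member_scores, dtype=float)
    n = np.sort(np.asarray(nonmember_scores, dtype=float))[::-1]  # descending
    k = min(int(np.floor(max_fpr * len(n))), len(n) - 1)  # allowed false positives
    threshold = n[k]  # flagging scores > threshold yields at most k false positives
    return float(np.mean(m > threshold))

# Hypothetical scores for illustration only.
members = [0.9, 0.8, 0.7, 0.4]
nonmembers = [0.6, 0.3, 0.2, 0.1]

example_auc = auc(members, nonmembers)                       # 15 of 16 pairs ordered correctly
example_tpr = tpr_at_fpr(members, nonmembers, max_fpr=0.25)  # one false positive allowed
```

With only four non-members, a 1% FPR budget allows no false positives at all, which is why the example uses a 25% budget. Real evaluations need enough non-member samples for the 1% operating point to be meaningful.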