Membership Inference Attacks Against Fine-tuned Diffusion Language Models

About

Diffusion Language Models (DLMs) are a promising alternative to autoregressive language models, relying on bidirectional masked token prediction. Yet their susceptibility to privacy leakage via Membership Inference Attacks (MIAs) remains underexplored. This paper presents the first systematic investigation of MIA vulnerabilities in DLMs. Whereas an autoregressive model exposes a single fixed prediction pattern, a DLM can be queried under exponentially many mask configurations, and the ability to probe many independent masks substantially improves an attacker's chances of detecting memorization. To exploit this, we introduce SAMA (Subset-Aggregated Membership Attack), which addresses the sparse-signal challenge through robust aggregation. SAMA samples masked subsets across progressive densities and applies sign-based statistics that remain effective under heavy-tailed noise. Its inverse-weighted aggregation prioritizes the cleaner signals from sparse masks, turning sparse memorization detection into a robust voting mechanism. Experiments on nine datasets show that SAMA achieves a 30% relative AUC improvement over the best baseline, with up to an 8-fold improvement at low false positive rates. These findings reveal significant, previously unknown vulnerabilities in DLMs and call for tailored privacy defenses.
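
The aggregation idea in the abstract (masks at several densities, sign-based votes over subsets, inverse weighting that favors sparse masks) can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `sign_vote_score`, the callables `loss_target` and `loss_reference`, the use of a reference model at all, the weight `1 / density`, and every default value are assumptions made for illustration only.

```python
import numpy as np


def sign_vote_score(loss_target, loss_reference, seq_len,
                    densities=(0.1, 0.25, 0.5), masks_per_density=8,
                    subsets_per_mask=16, subset_size=3, seed=0):
    """Toy membership score in [0, 1]; higher means 'more member-like'.

    loss_target / loss_reference are hypothetical callables: given an array
    of masked positions, each returns one loss value per masked position.
    """
    rng = np.random.default_rng(seed)
    weighted_votes, total_weight = 0.0, 0.0
    for density in densities:
        n_masked = max(1, int(round(density * seq_len)))
        weight = 1.0 / density  # assumed form of inverse weighting: sparse masks count more
        for _ in range(masks_per_density):
            masked = rng.choice(seq_len, size=n_masked, replace=False)
            # Positive gap: the target model predicts these tokens better than the reference.
            gap = np.asarray(loss_reference(masked)) - np.asarray(loss_target(masked))
            k = min(subset_size, n_masked)
            # Each subset casts a binary vote based only on the sign of its mean gap,
            # so a single extreme loss value cannot dominate the final score.
            votes = [float(rng.choice(gap, size=k, replace=False).mean() > 0)
                     for _ in range(subsets_per_mask)]
            weighted_votes += weight * np.mean(votes)
            total_weight += weight
    return weighted_votes / total_weight


# Toy usage with constant per-token losses standing in for real model outputs.
member_like = sign_vote_score(lambda idx: np.full(len(idx), 1.0),
                              lambda idx: np.full(len(idx), 2.0), seq_len=20)
nonmember_like = sign_vote_score(lambda idx: np.full(len(idx), 2.0),
                                 lambda idx: np.full(len(idx), 2.0), seq_len=20)
```

Because each subset contributes only a 0/1 vote, the score depends on how often the target model beats the reference rather than by how much, which is one simple way to stay robust to heavy-tailed loss values.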

Yuetian Chen, Kaiyuan Zhang, Yuntao Du, Edoardo Stoppa, Charles Fleming, Ashish Kundu, Bruno Ribeiro, Ninghui Li • 2026

Related benchmarks

Task	Dataset	Result
Membership Inference Attack	Pile-CC	TPR @ 1%: 0.115	61
Membership Inference Attack	arXiv	AUC: 85	55
Membership Inference Attack	AG News (test)	AUC: 0.673	43
Membership Inference Attack	XSum (test)	AUC: 0.682	43
Membership Inference Attack	GitHub	AUC: 0.876	32
Membership Inference Attack	HackerNews	AUC: 0.657	26
Membership Inference Attack	PubMed Central	AUC: 0.814	26
Membership Inference Attack	Wikipedia en	AUC: 0.79	26
Membership Inference Attack	WikiText-103 (test)	AUC: 0.782	13
Membership Inference	MIMIR arXiv 1.0 (test)	AUC-ROC: 0.783	6

Showing 10 of 15 rows
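
For readers unfamiliar with the metrics in the table, the sketch below shows one common way to compute AUC and the true positive rate at a fixed false positive rate from attack scores. It assumes that "TPR @ 1%" means TPR at 1% FPR and that higher scores indicate membership. The function name `auc_and_tpr_at_fpr` is hypothetical, and this is not the evaluation code behind the reported numbers.

```python
import numpy as np


def auc_and_tpr_at_fpr(member_scores, nonmember_scores, max_fpr=0.01):
    """Return (AUC, TPR at FPR <= max_fpr), assuming higher score = 'member'."""
    members = np.asarray(member_scores, dtype=float)
    nonmembers = np.asarray(nonmember_scores, dtype=float)

    # AUC as the probability that a random member outscores a random
    # non-member, counting ties as half a win.
    diff = members[:, None] - nonmembers[None, :]
    auc = float((diff > 0).mean() + 0.5 * (diff == 0).mean())

    # Number of non-members we may wrongly flag without exceeding max_fpr.
    allowed_fp = int(np.floor(max_fpr * len(nonmembers)))
    # Flag a sample when its score is strictly above the threshold; choosing the
    # (allowed_fp + 1)-th largest non-member score keeps false positives <= allowed_fp.
    threshold = np.sort(nonmembers)[::-1][allowed_fp]
    tpr = float((members > threshold).mean())
    return auc, tpr


# Toy usage with made-up scores.
auc, tpr = auc_and_tpr_at_fpr([0.9, 0.8, 0.3], [0.1, 0.2, 0.4, 0.5], max_fpr=0.25)
```

TPR at a low FPR is stricter than AUC because it only rewards an attack for the members it identifies while almost never flagging a non-member.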
