
A Unified Perspective on Adversarial Membership Manipulation in Vision Models

About

Membership inference attacks (MIAs) aim to determine whether a specific data point was part of a model's training set, serving as effective tools for evaluating privacy leakage of vision models. However, existing MIAs implicitly assume honest query inputs, and their adversarial robustness remains unexplored. We show that MIAs for vision models expose a previously overlooked adversarial surface: adversarial membership manipulation, where imperceptible perturbations can reliably push non-member images into the "member" region of state-of-the-art MIAs. In this paper, we provide the first unified perspective on this phenomenon by analyzing its mechanism and implications. We begin by demonstrating that adversarial membership fabrication is consistently effective across diverse architectures and datasets. We then reveal a distinctive geometric signature - a characteristic gradient-norm collapse trajectory - that reliably separates fabricated from true members despite their nearly identical semantic representations. Building on this insight, we introduce a principled detection strategy grounded in gradient-geometry signals and develop a robust inference framework that substantially mitigates adversarial manipulation. Extensive experiments show that fabrication is broadly effective, while our detection and robust inference strategies significantly enhance resilience. This work establishes the first comprehensive framework for adversarial membership manipulation in vision models.
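The fabrication idea in the abstract can be illustrated with a toy sketch. This is not the authors' method: it assumes a simple loss-based membership signal (lower loss looks more "member-like"), a hypothetical linear softmax classifier `W` in place of a vision model, and a signed-gradient update constrained to an L∞ ball of radius 4/255, the budget listed in the benchmark rows. The function names `softmax_xent` and `fabricate_member` are illustrative only. The sketch also records the input-gradient norm at each step, the kind of trajectory a gradient-geometry detector could inspect.

```python
import numpy as np

def softmax_xent(W, x, y):
    """Cross-entropy loss of a linear softmax classifier and its gradient w.r.t. the input x."""
    logits = W @ x
    logits = logits - logits.max()            # subtract max for numerical stability
    p = np.exp(logits) / np.exp(logits).sum()
    loss = -np.log(p[y] + 1e-12)
    grad_x = W.T @ (p - np.eye(len(p))[y])    # d(loss)/dx for a linear model
    return loss, grad_x

def fabricate_member(W, x, y, eps=4.0 / 255, step=1.0 / 255, n_steps=20):
    """Signed-gradient descent on the loss, kept inside an L-inf ball of radius eps around x."""
    x_adv = x.copy()
    losses, grad_norms = [], []
    for _ in range(n_steps):
        loss, g = softmax_xent(W, x_adv, y)
        losses.append(loss)
        grad_norms.append(np.linalg.norm(g))
        x_adv = x_adv - step * np.sign(g)          # move toward lower loss
        x_adv = np.clip(x_adv, x - eps, x + eps)   # project back into the L-inf ball
        x_adv = np.clip(x_adv, 0.0, 1.0)           # stay in the valid pixel range
    loss, g = softmax_xent(W, x_adv, y)
    losses.append(loss)
    grad_norms.append(np.linalg.norm(g))
    return x_adv, np.array(losses), np.array(grad_norms)

# Toy data (assumption): a random 10-class linear model and a random 64-dim "image".
rng = np.random.default_rng(0)
W = 0.1 * rng.normal(size=(10, 64))
x = rng.uniform(0.0, 1.0, size=64)
y = 3

x_adv, losses, grad_norms = fabricate_member(W, x, y)
print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
print(f"max |delta|: {np.abs(x_adv - x).max():.4f}")
```

A loss-threshold attack would flag `x_adv` as a member once its loss falls below the decision threshold, even though the perturbation stays within the stated budget. The `grad_norms` array is where a detector of the kind described above would look for an abnormal trajectory.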

Ruize Gao, Kaiwen Zhou, Yongqiang Chen, Feng Liu • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Membership Inference Attack | CIFAR-10 | AUC | 0.7937 | 48 |
| Membership Inference Attack | CIFAR-100 | TPR @ 1% FPR | 11.6 | 18 |
| Membership Inference Attack | SVHN | AUC | 0.7273 | 16 |
| Attack Robustness Analysis | CIFAR-10 | AUC | 0.7871 | 8 |
| Attack Robustness Analysis | CIFAR-100 | AUC | 0.7881 | 8 |
| Attack Robustness Analysis | SVHN | AUC | 71.85 | 8 |
| Attack Robustness Analysis | CINIC-10 | AUC | 81.67 | 8 |
| Membership Inference Attack | CINIC-10 | AUC | 0.8166 | 8 |
| Membership Inference Attack | CIFAR-10, ‖δ‖∞ ≤ 4.0/255 (test) | AUC | 0.8065 | 8 |
| Membership Inference Attack | CIFAR-100, ‖δ‖∞ ≤ 4.0/255 (test) | AUC | 0.8285 | 8 |
Showing 10 of 14 rows
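
For readers unfamiliar with the two metrics in the table, the following is a minimal sketch of how AUC and TPR at a fixed FPR are commonly computed from per-sample membership scores. It assumes that higher scores mean "more likely a member"; the helper names `roc_auc` and `tpr_at_fpr` and the toy scores are illustrative and not taken from the paper or the benchmark.

```python
import numpy as np

def roc_auc(scores, is_member):
    """AUC as the probability that a member outscores a non-member (ties count 1/2)."""
    scores = np.asarray(scores, dtype=float)
    is_member = np.asarray(is_member, dtype=bool)
    pos, neg = scores[is_member], scores[~is_member]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def tpr_at_fpr(scores, is_member, max_fpr=0.01):
    """Highest true-positive rate over thresholds whose false-positive rate is <= max_fpr."""
    scores = np.asarray(scores, dtype=float)
    is_member = np.asarray(is_member, dtype=bool)
    pos, neg = scores[is_member], scores[~is_member]
    best = 0.0
    for t in np.unique(scores):              # candidate thresholds: predict member if score >= t
        fpr = np.mean(neg >= t)
        if fpr <= max_fpr:
            best = max(best, float(np.mean(pos >= t)))
    return best

# Toy scores (assumption): four members and four non-members.
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.3, 0.2, 0.1]
is_member = [1, 1, 0, 1, 0, 1, 0, 0]
print(roc_auc(scores, is_member))            # 13 of 16 member/non-member pairs ranked correctly
print(tpr_at_fpr(scores, is_member, 0.01))   # TPR with zero false positives allowed here
```

TPR at a low FPR is often reported alongside AUC because it measures how many members an attack identifies while almost never flagging non-members, which AUC alone can hide.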
