Concept-Aware Privacy Mechanisms for Defending Embedding Inversion Attacks

About

Text embeddings enable numerous NLP applications but face severe privacy risks from embedding inversion attacks, which can expose sensitive attributes or reconstruct raw text. Existing differential privacy defenses assume uniform sensitivity across embedding dimensions, leading to excessive noise and degraded utility. We propose SPARSE, a user-centric framework for concept-specific privacy protection in text embeddings. SPARSE combines (1) differentiable mask learning to identify privacy-sensitive dimensions for user-defined concepts, and (2) the Mahalanobis mechanism that applies elliptical noise calibrated by dimension sensitivity. Unlike traditional spherical noise injection, SPARSE selectively perturbs privacy-sensitive dimensions while preserving non-sensitive semantics. Evaluated across six datasets with three embedding models and attack scenarios, SPARSE consistently reduces privacy leakage while achieving superior downstream performance compared to state-of-the-art DP methods.
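
As a rough illustration of the contrast between spherical and elliptical noise described above, the Python sketch below perturbs an embedding with Gaussian noise whose per-dimension scale follows a sensitivity vector. It is not the SPARSE implementation and not the Mahalanobis mechanism itself: it substitutes a diagonal-covariance Gaussian, carries no differential privacy guarantee, and the names `spherical_noise`, `elliptical_noise`, `sensitivity`, and `scale` are hypothetical. The sensitivity vector stands in for a learned mask and is hard-coded here.

```python
import numpy as np


def spherical_noise(embedding, scale, rng):
    """Baseline: add Gaussian noise with the same scale on every dimension."""
    return embedding + rng.normal(0.0, scale, size=embedding.shape)


def elliptical_noise(embedding, sensitivity, scale, rng):
    """Illustrative stand-in: per-dimension noise scaled by a sensitivity vector.

    `sensitivity` is a hypothetical vector in [0, 1], e.g. a learned mask.
    The noise is drawn from N(0, scale^2 * diag(sensitivity^2)), so dimensions
    with sensitivity 0 are left untouched.
    """
    sensitivity = np.asarray(sensitivity, dtype=float)
    if sensitivity.shape != embedding.shape:
        raise ValueError("sensitivity must have the same shape as embedding")
    unit_noise = rng.normal(0.0, 1.0, size=embedding.shape)
    return embedding + unit_noise * scale * sensitivity


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    embedding = rng.normal(size=8)
    # Assumption: only the first three dimensions encode the protected concept.
    sensitivity = np.array([1.0, 0.8, 0.6, 0.0, 0.0, 0.0, 0.0, 0.0])

    uniform = spherical_noise(embedding, scale=0.5, rng=rng)
    targeted = elliptical_noise(embedding, sensitivity, scale=0.5, rng=rng)

    print("per-dimension change, spherical: ", np.abs(uniform - embedding).round(3))
    print("per-dimension change, elliptical:", np.abs(targeted - embedding).round(3))
```

In this toy setup the elliptical variant leaves the five zero-sensitivity dimensions unchanged, which is the intuition behind preserving non-sensitive semantics; a real mechanism would need a calibrated noise distribution to make any formal privacy claim.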

Yu-Che Tsai, Hsiang Hsiao, Kuan-Yu Chen, Shou-De Lin • 2026

Related benchmarks

Task	Dataset	Result
Semantic Textual Similarity	STS 2014	--	39
Information Retrieval	NFCorpus	Leakage 0.68	16
Privacy-utility tradeoff	STS12	Leakage 4.34	16
Privacy-utility tradeoff	FiQA	Leakage 8.48	16
Semantic Textual Similarity	STSB	Leakage 2.68	16
Semantic Textual Similarity	STS12	Downstream Performance 73.25	5
