Large Language Models are Advanced Anonymizers
About
Recent privacy research on large language models (LLMs) has shown that they achieve near-human-level performance at inferring personal data from online texts. With ever-increasing model capabilities, existing text anonymization methods are lagging behind regulatory requirements and adversarial threats. In this work, we take two steps to bridge this gap: First, we present a new setting for evaluating anonymization in the face of adversarial LLM inferences, allowing for a natural measurement of anonymization performance while remedying some of the shortcomings of previous metrics. Then, within this setting, we develop a novel LLM-based adversarial anonymization framework leveraging the strong inferential capabilities of LLMs to inform our anonymization procedure. We conduct a comprehensive experimental evaluation of adversarial anonymization across 13 LLMs on real-world and synthetic online texts, comparing it against multiple baselines and industry-grade anonymizers. Our evaluation shows that adversarial anonymization outperforms current commercial anonymizers in terms of both the resulting utility and privacy. We support our findings with a human study (n=50) highlighting a strong and consistent human preference for LLM-anonymized texts.
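The core idea of adversarial anonymization, alternating between an adversarial LLM that infers personal attributes and an anonymizing LLM that rewrites the text to remove them, can be sketched as a simple feedback loop. The sketch below is a minimal illustration only: `infer_attributes` and `rewrite_text` are hypothetical stubs standing in for the actual LLM calls, not the framework's implementation.

```python
def infer_attributes(text):
    # Hypothetical stand-in for the adversarial LLM: returns the
    # personal attributes it can still infer from the text.
    leaks = []
    if "Zurich" in text:
        leaks.append(("location", "Zurich"))
    if "software engineer" in text:
        leaks.append(("occupation", "software engineer"))
    return leaks


def rewrite_text(text, leaks):
    # Hypothetical stand-in for the anonymizing LLM: rewrites the
    # text to abstract away the attributes the adversary inferred.
    replacements = {
        "Zurich": "a European city",
        "software engineer": "tech professional",
    }
    for _, value in leaks:
        text = text.replace(value, replacements.get(value, "[redacted]"))
    return text


def adversarial_anonymize(text, max_rounds=3):
    """Alternate adversarial inference and rewriting until the
    adversary can no longer infer any personal attribute."""
    for _ in range(max_rounds):
        leaks = infer_attributes(text)
        if not leaks:
            break  # adversary infers nothing; anonymization done
        text = rewrite_text(text, leaks)
    return text


print(adversarial_anonymize(
    "I work as a software engineer and commute to Zurich daily."))
# → I work as a tech professional and commute to a European city daily.
```

In the actual framework both roles are played by LLMs, and the loop's stopping criterion is whether the adversarial model still succeeds at attribute inference, which is exactly the metric the evaluation setting measures.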
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Attribute Inference | Synthetic dataset | Attribute Inference Accuracy | 28.76 | 33 |
| Attribute Inference | Real-world Reddit-derived dataset 1.0 (test) | Attribute Inference Accuracy | 42.86 | 30 |
| Text Anonymization | SynthPAI | Privacy | 64 | 22 |
| Text Anonymization | DB-Bio | Privacy Score | 78 | 17 |
| Text Anonymization | MedQA | Privacy Score | 24.4 | 16 |
| Text Anonymization | PUPA | Privacy Score | 94.2 | 16 |
| Text Anonymization | TAB | Privacy Score | 62.1 | 16 |
| Text Anonymization | PersonalReddit | Privacy Score | 36.5 | 14 |
| Text Anonymization | reddit-self-disclosure | Utility Score | 0.9218 | 8 |
| Text Anonymization | Human evaluation | PPP | 7.5 | 5 |