ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying
About
Large Language Model (LLM) agents have seen rapid adoption and demonstrated remarkable capabilities across a wide range of applications. To improve reasoning and task execution, modern LLM agents often incorporate memory modules or retrieval-augmented generation (RAG) mechanisms, enabling them to leverage prior interactions or external knowledge. However, this design also introduces a class of critical privacy vulnerabilities: sensitive information stored in memory can be leaked through query-based attacks. While such attacks are feasible, existing ones often achieve only limited performance, with low attack success rates (ASR). In this paper, we propose ADAM, a novel privacy attack that estimates the data distribution of a victim agent's memory and employs an entropy-guided query strategy to maximize privacy leakage. Extensive experiments demonstrate that our attack substantially outperforms state-of-the-art baselines, achieving ASRs of up to 100%. These results underscore the urgent need for robust privacy-preserving methods for current LLM agents.
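The abstract describes two components: estimating the distribution of data in the victim agent's memory, and using that estimate to pick the next query where uncertainty is highest. The sketch below is a minimal, hypothetical illustration of that loop, not the paper's actual method: it assumes memory records can be bucketed into `topics`, maintains a Laplace-smoothed distribution estimate from records extracted so far, and probes the topic with the lowest estimated coverage (i.e., highest residual uncertainty). All function names and the topic abstraction are illustrative assumptions.

```python
import math
from collections import Counter

def shannon_entropy(probs):
    """Shannon entropy (in bits) of a probability vector."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def estimate_topic_distribution(extracted_topics, topics, smoothing=1.0):
    """Laplace-smoothed estimate of how the victim memory is distributed
    over an assumed set of topics, based on records extracted so far.
    (Illustrative stand-in for the paper's distribution estimation.)"""
    counts = Counter(extracted_topics)
    total = len(extracted_topics) + smoothing * len(topics)
    return {t: (counts.get(t, 0) + smoothing) / total for t in topics}

def next_query_topic(extracted_topics, topics):
    """Entropy-guided choice (sketch): probe the topic with the lowest
    estimated coverage, where residual uncertainty is highest."""
    dist = estimate_topic_distribution(extracted_topics, topics)
    return min(topics, key=lambda t: dist[t])

# Toy usage: after extracting two "labs" records and one "meds" record,
# the least-covered topic is probed next.
topics = ["labs", "meds", "notes"]
extracted = ["labs", "labs", "meds"]
print(next_query_topic(extracted, topics))  # -> notes
```

In a real attack the adversary would of course see agent responses rather than topic labels, and the estimate would be updated from whatever the queries reveal; this sketch only shows the shape of the adaptive loop.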
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Data Extraction Attack | EHRAgent | Equality (EQ): 83 | 20 |
| Data Extraction Attack | ReAct | Equality (EQ): 86 | 20 |
| Data Extraction Attack | RAP | Equality (EQ): 73 | 20 |
| Data Extraction Attack on Agent Memory | EHRAgent (test) | Equality (EQ): 82 | 12 |
| Data Extraction Attack on Agent Memory | ReAct (test) | Equality (EQ): 81 | 12 |
| Data Extraction Attack on Agent Memory | RAP (test) | Equality (EQ): 71 | 12 |