ADAM: A Systematic Data Extraction Attack on Agent Memory via Adaptive Querying

Large Language Model (LLM) agents have been adopted rapidly and have demonstrated remarkable capabilities across a wide range of applications. To improve reasoning and task execution, modern LLM agents often incorporate memory modules or retrieval-augmented generation (RAG) mechanisms, which let them draw on prior interactions or external knowledge. However, this design also introduces a class of critical privacy vulnerabilities: sensitive information stored in memory can be leaked through query-based attacks. Although such attacks are feasible, existing ones often achieve only limited performance, with low attack success rates (ASR). In this paper, we propose ADAM, a novel privacy attack that estimates the data distribution of a victim agent's memory and employs an entropy-guided query strategy to maximize privacy leakage. Extensive experiments demonstrate that our attack substantially outperforms state-of-the-art attacks, achieving an ASR of up to 100%. These results underscore the urgent need for robust privacy-preserving methods for current LLM agents.

Key Contributions

A novel adaptive querying attack (ADAM) that estimates the data distribution in agent memory and uses an entropy-guided query strategy
Achieves an attack success rate of up to 100%, substantially outperforming prior privacy attacks on LLM agents
Demonstrates critical privacy vulnerabilities in memory-augmented and RAG-based LLM agent architectures
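
To make the phrase "entropy-guided" more concrete, the toy sketch below shows one generic way entropy can rank candidate queries: each candidate carries an estimated distribution over possible outcomes, and the candidate with the highest Shannon entropy (the most uncertain one) is chosen next. This is a standard uncertainty-sampling heuristic and is only an assumption for illustration; the summary above does not specify ADAM's actual procedure. The names `shannon_entropy`, `pick_query`, and the `candidates` table are hypothetical.

```python
import math


def shannon_entropy(probs):
    """Shannon entropy in bits of a discrete distribution (zero terms skipped)."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def pick_query(candidates):
    """Return the candidate whose estimated outcome distribution is most uncertain.

    `candidates` maps a query label to a list of outcome probabilities.
    Both the labels and the probabilities are made-up illustration values.
    """
    return max(candidates, key=lambda q: shannon_entropy(candidates[q]))


# Hypothetical estimates; not taken from the paper.
candidates = {
    "topic_a": [0.9, 0.1],  # outcome mostly predictable
    "topic_b": [0.5, 0.5],  # maximally uncertain for two outcomes
    "topic_c": [1.0, 0.0],  # outcome already known
}

best = pick_query(candidates)
print(best, round(shannon_entropy(candidates[best]), 3))
```

In this toy setup the selection favors queries whose outcome is hardest to predict from the current estimate, on the assumption that those responses are the most informative. A real adaptive attack would also need to update the estimates after each response, which is left out here.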

🛡️ Threat Analysis

Model Inversion Attack

The attack extracts sensitive private information stored in agent memory or RAG systems by querying the agent, which makes it a form of model inversion or data extraction against a deployed system. The adversary reconstructs training or stored data through strategic queries.