Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents
Wei Zou 1,2, Mingwen Dong 2, Miguel Romero Calvo 2, Shuaichen Chang 2, Jiang Guo 2, Dongkyu Lee 2, Xing Niu 2, Xiaofei Ma 2, Yanjun Qi 2, Jiarong Jiang 2
Published on arXiv
2604.02623
Data Poisoning Attack
OWASP ML Top 10 — ML02
Prompt Injection
OWASP LLM Top 10 — LLM01
Excessive Agency
OWASP LLM Top 10 — LLM08
Key Finding
Achieves attack success rates of 32.5% on GPT-5-mini, 23.4% on GPT-5.2, and 19.5% on GPT-OSS-120B, with up to 8x higher success under environmental stress conditions
eTAMP (Environment-injected Trajectory-based Agent Memory Poisoning)
Novel technique introduced
Memory makes LLM-based web agents personalized, powerful, yet exploitable. By storing past interactions to personalize future tasks, agents inadvertently create a persistent attack surface that spans websites and sessions. While existing security research on memory assumes attackers can directly inject into memory storage or exploit shared memory across users, we present a more realistic threat model: contamination through environmental observation alone. We introduce Environment-injected Trajectory-based Agent Memory Poisoning (eTAMP), the first attack to achieve cross-session, cross-site compromise without requiring direct memory access. A single contaminated observation (e.g., viewing a manipulated product page) silently poisons an agent's memory and activates during future tasks on different websites, bypassing permission-based defenses. Our experiments on (Visual)WebArena reveal two key findings. First, eTAMP achieves substantial attack success rates: up to 32.5% on GPT-5-mini, 23.4% on GPT-5.2, and 19.5% on GPT-OSS-120B. Second, we discover Frustration Exploitation: agents under environmental stress become dramatically more susceptible, with ASR increasing up to 8 times when agents struggle with dropped clicks or garbled text. Notably, more capable models are not more secure. GPT-5.2 shows substantial vulnerability despite superior task performance. With the rise of AI browsers like OpenClaw, ChatGPT Atlas, and Perplexity Comet, our findings underscore the urgent need for defenses against environment-injected memory poisoning.
Key Contributions
- First cross-session, cross-site memory poisoning attack through environmental observation alone without requiring direct memory access
- Discovery of the Frustration Exploitation phenomenon, in which environmental stress increases attack susceptibility by up to 8x
- Introduction of a Chaos Monkey methodology for studying agent robustness under realistic deployment conditions with environmental noise
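The kind of environmental noise named in the last contribution (dropped clicks, garbled text) can be pictured with a small wrapper that sits between an agent and a page. This is only an illustrative sketch, not the authors' harness: the class `NoisyEnvironment`, its method names, and the probability parameters are all hypothetical.

```python
import random


class NoisyEnvironment:
    """Hypothetical noise layer between an agent and a web page.

    Assumption: the two noise types mirror the "dropped clicks" and
    "garbled text" mentioned in the abstract; the probabilities are
    illustrative and not taken from the paper.
    """

    def __init__(self, drop_click_prob=0.3, garble_prob=0.1, seed=0):
        self.drop_click_prob = drop_click_prob
        self.garble_prob = garble_prob
        self.rng = random.Random(seed)  # seeded so runs are repeatable

    def click(self, element_id):
        """Return True if the click is delivered, False if silently dropped."""
        return self.rng.random() >= self.drop_click_prob

    def observe(self, text):
        """Return the page text with some alphanumeric characters replaced by '#'."""
        return "".join(
            "#" if ch.isalnum() and self.rng.random() < self.garble_prob else ch
            for ch in text
        )


if __name__ == "__main__":
    env = NoisyEnvironment()
    print(env.click("add-to-cart"))
    print(env.observe("Price: 19.99 USD"))
```

Seeding the random generator keeps a noisy run reproducible, which makes it possible to compare the same task with and without noise.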
🛡️ Threat Analysis
The attack poisons the agent's memory system by injecting malicious content through environmental observations, which are then stored in trajectory memory. This is a form of data poisoning in which the contaminated data is the agent's stored memory rather than a training set, and the contamination happens during normal operation.
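The flow described above can be sketched with a toy trajectory memory. This is a minimal illustration under stated assumptions, not the paper's agent: `TrajectoryMemory`, `build_prompt`, the word-overlap retrieval, and the `.example` site names are all hypothetical, and the injected content is left as a placeholder string.

```python
from dataclasses import dataclass, field


@dataclass
class TrajectoryMemory:
    """Toy memory: stores raw observations from past tasks (hypothetical design)."""

    entries: list = field(default_factory=list)

    def store(self, site, task, observation):
        # The observation is saved verbatim, including any text the page author wrote.
        self.entries.append({"site": site, "task": task, "observation": observation})

    def retrieve(self, task, k=2):
        """Rank stored entries by word overlap with the new task (toy retrieval)."""
        words = set(task.lower().split())
        ranked = sorted(
            self.entries,
            key=lambda e: len(words & set(e["task"].lower().split())),
            reverse=True,
        )
        return ranked[:k]


def build_prompt(memory, task):
    """Assemble the context for a new task from recalled entries."""
    recalled = memory.retrieve(task)
    context = "\n".join(f"[{e['site']}] {e['observation']}" for e in recalled)
    return f"Past experience:\n{context}\n\nCurrent task: {task}"


if __name__ == "__main__":
    memory = TrajectoryMemory()
    # Session 1: the agent only views a page; nothing writes to memory directly.
    memory.store(
        "shop.example",
        "find cheapest laptop",
        "Product page text <attacker-controlled text>",
    )
    # Session 2: a different site and task, but the stored text is recalled.
    print(build_prompt(memory, "find laptop reviews on forum.example"))
```

In this sketch the page text from the first session reaches the prompt for the second session purely through storage and retrieval, which is the property the threat analysis describes: no direct write access to memory is involved.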