attack 2026

Poison Once, Exploit Forever: Environment-Injected Memory Poisoning Attacks on Web Agents

Wei Zou ¹,², Mingwen Dong ², Miguel Romero Calvo ², Shuaichen Chang ², Jiang Guo ², Dongkyu Lee ², Xing Niu ², Xiaofei Ma ², Yanjun Qi ², Jiarong Jiang ²

¹ Pennsylvania State University

² Amazon Web Services

0 citations

Published on arXiv

2604.02623

Data Poisoning Attack

OWASP ML Top 10 — ML02

Prompt Injection

OWASP LLM Top 10 — LLM01

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Achieves attack success rates of up to 32.5% on GPT-5-mini, 23.4% on GPT-5.2, and 19.5% on GPT-OSS-120B, with up to 8x higher success when agents face environmental stress such as dropped clicks or garbled text.

eTAMP (Environment-injected Trajectory-based Agent Memory Poisoning)

Novel technique introduced

Memory makes LLM-based web agents personalized, powerful, yet exploitable. By storing past interactions to personalize future tasks, agents inadvertently create a persistent attack surface that spans websites and sessions. While existing security research on memory assumes attackers can directly inject into memory storage or exploit shared memory across users, we present a more realistic threat model: contamination through environmental observation alone. We introduce Environment-injected Trajectory-based Agent Memory Poisoning (eTAMP), the first attack to achieve cross-session, cross-site compromise without requiring direct memory access. A single contaminated observation (e.g., viewing a manipulated product page) silently poisons an agent's memory and activates during future tasks on different websites, bypassing permission-based defenses. Our experiments on (Visual)WebArena reveal two key findings. First, eTAMP achieves substantial attack success rates: up to 32.5% on GPT-5-mini, 23.4% on GPT-5.2, and 19.5% on GPT-OSS-120B. Second, we discover Frustration Exploitation: agents under environmental stress become dramatically more susceptible, with ASR increasing up to 8 times when agents struggle with dropped clicks or garbled text. Notably, more capable models are not more secure. GPT-5.2 shows substantial vulnerability despite superior task performance. With the rise of AI browsers like OpenClaw, ChatGPT Atlas, and Perplexity Comet, our findings underscore the urgent need for defenses against environment-injected memory poisoning.
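To make the reported metrics concrete, here is a minimal sketch of how an attack success rate (ASR) and a stress-versus-baseline ratio could be computed. The function names and the trial counts are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's exact evaluation protocol may differ.

```python
def attack_success_rate(successes: int, attempts: int) -> float:
    """Fraction of attack attempts in which the injected goal was carried out."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    if not 0 <= successes <= attempts:
        raise ValueError("successes must be between 0 and attempts")
    return successes / attempts


def susceptibility_ratio(stressed_asr: float, baseline_asr: float) -> float:
    """How many times higher the ASR is under stress than at baseline."""
    if baseline_asr <= 0:
        raise ValueError("baseline ASR must be positive to form a ratio")
    return stressed_asr / baseline_asr


# Hypothetical counts, NOT results from the paper.
baseline_asr = attack_success_rate(successes=3, attempts=60)
stressed_asr = attack_success_rate(successes=18, attempts=60)
ratio = susceptibility_ratio(stressed_asr, baseline_asr)

print(f"baseline ASR: {baseline_asr:.1%}")
print(f"stressed ASR: {stressed_asr:.1%}")
print(f"ratio: {ratio:.1f}x")
```

A ratio like this is only meaningful when the baseline ASR is non-zero, which is why the sketch rejects a zero baseline rather than returning infinity.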

Key Contributions

First cross-session, cross-site memory poisoning attack through environmental observation alone without requiring direct memory access
Discovery of Frustration Exploitation phenomenon where environmental stress increases attack susceptibility up to 8x
Introduction of Chaos Monkey methodology to study agent robustness under realistic deployment conditions with environmental noise
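The environmental stress described above (dropped clicks, garbled text) can be pictured as a noise wrapper around the agent's environment. The sketch below is an assumption about what such a wrapper might look like, not the authors' Chaos Monkey implementation: `ToyPage`, `ChaosMonkeyEnv`, and the two probability parameters are hypothetical names introduced here.

```python
import random


class ToyPage:
    """Hypothetical stand-in for a web environment: it only counts clicks."""

    def __init__(self) -> None:
        self.clicks = 0

    def observe(self) -> str:
        return f"clicks registered: {self.clicks}"

    def step(self, action: str) -> str:
        if action.startswith("click"):
            self.clicks += 1
        return self.observe()


class ChaosMonkeyEnv:
    """Wraps an environment and injects two kinds of noise (illustrative only)."""

    def __init__(self, env, drop_click_prob: float, garble_prob: float, seed: int = 0):
        self.env = env
        self.drop_click_prob = drop_click_prob
        self.garble_prob = garble_prob
        self.rng = random.Random(seed)  # seeded so runs are reproducible

    def _garble(self, text: str) -> str:
        # Replace each alphanumeric character with '#' with probability garble_prob.
        return "".join(
            "#" if ch.isalnum() and self.rng.random() < self.garble_prob else ch
            for ch in text
        )

    def step(self, action: str) -> str:
        # A dropped click never reaches the wrapped environment; the agent
        # just sees the unchanged page again.
        if action.startswith("click") and self.rng.random() < self.drop_click_prob:
            return self._garble(self.env.observe())
        return self._garble(self.env.step(action))


page = ToyPage()
noisy = ChaosMonkeyEnv(page, drop_click_prob=0.5, garble_prob=0.2, seed=42)
for _ in range(10):
    print(noisy.step("click #buy-button"))
```

Keeping the noise in a wrapper leaves the underlying environment untouched, so the same tasks can be run with and without stress and compared directly.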

🛡️ Threat Analysis

Data Poisoning Attack

The attack poisons the agent's memory system by injecting malicious content through environmental observations that get stored in trajectory memory. This is a form of data poisoning in which the contaminated data is the agent's stored memory rather than a training set, and the contamination happens during normal operation at inference time.
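The flow described here (observe once, store, retrieve later on another site) can be illustrated with a toy memory store. This is a simplified sketch under stated assumptions, not the paper's system: `TrajectoryMemory`, `build_prompt`, the site names, and the keyword-overlap retrieval are all hypothetical, and the injected text is a harmless placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class TrajectoryMemory:
    """Toy memory: stores raw observations and retrieves by word overlap."""

    entries: list = field(default_factory=list)

    def store(self, site: str, observation: str) -> None:
        # The observation is stored verbatim, including any attacker text in it.
        self.entries.append(f"[{site}] {observation}")

    def retrieve(self, query: str, k: int = 2) -> list:
        words = set(query.lower().split())
        ranked = sorted(
            self.entries,
            key=lambda e: len(words & set(e.lower().split())),
            reverse=True,
        )
        return ranked[:k]


def build_prompt(task: str, memory: TrajectoryMemory) -> str:
    """Assemble the context a later session would hand to the model."""
    recalled = "\n".join(memory.retrieve(task))
    return f"Past experience:\n{recalled}\n\nCurrent task: {task}"


MARKER = "INJECTED_INSTRUCTION: <attacker-controlled text>"
memory = TrajectoryMemory()

# Session 1, site A: the agent merely views a manipulated product page.
memory.store("shop.example", f"Product page for a blue mug. {MARKER}")
memory.store("shop.example", "Added the blue mug to the cart.")

# Session 2, site B: an unrelated task on a different site.
prompt = build_prompt("Post a review about a mug on forum.example", memory)
print(prompt)
```

The point of the sketch is that nothing writes to memory directly: the attacker text enters only through an observation, yet it reappears in the context of a later task on a different site.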

Details

Domains

nlp, multimodal

Model Types

llm, vlm

Threat Tags

black_box, inference_time, targeted

Datasets

WebArena, VisualWebArena

Applications

web agents, autonomous browsing, ai browsers, e-commerce automation


Similar Papers

ScamAgents: How AI Agents Can Simulate Human-Level Scam Calls

CaMeLs Can Use Computers Too: System-level Security for Computer Use Agents

Stop Fixating on Prompts: Reasoning Hijacking and Constraint Tightening for Red-Teaming LLM Agents

How to make Medical AI Systems safer? Simulating Vulnerabilities, and Threats in Multimodal Medical RAG System

Visual Inception: Compromising Long-term Planning in Agentic Recommenders via Multimodal Memory Poisoning

Hidden in the Metadata: Stealth Poisoning Attacks on Multimodal Retrieval-Augmented Generation

How Vulnerable Are AI Agents to Indirect Prompt Injections? Insights from a Large-Scale Public Competition

When Actions Go Off-Task: Detecting and Correcting Misaligned Actions in Computer-Use Agents