defense 2026

An AI Agent Execution Environment to Safeguard User Data

Robert Stanley 1, Avi Verma 1, Lillian Tsai 2, Konstantinos Kallas 1, Sam Kumar 1


Published on arXiv

2604.19657

Prompt Injection

OWASP LLM Top 10 — LLM01

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

In the authors' evaluation, GAAP blocks all data disclosure attacks, including prompt-injection exfiltration attempts that succeed against other state-of-the-art systems, without a significant impact on agent utility

GAAP

Novel technique introduced


AI agents promise to serve as general-purpose personal assistants for their users, which requires them to have access to private user data (e.g., personal and financial information). This poses a serious risk to security and privacy. Adversaries may attack the AI model (e.g., via prompt injection) to exfiltrate user data. Furthermore, sharing private data with an AI agent requires users to trust a potentially unscrupulous or compromised AI model provider with their private data. This paper presents GAAP (Guaranteed Accounting for Agent Privacy), an execution environment for AI agents that guarantees confidentiality for private user data. Through dynamic and directed user prompts, GAAP collects permission specifications from users describing how their private data may be shared, and GAAP enforces that the agent's disclosures of private user data, including disclosures to the AI model and its provider, comply with these specifications. Crucially, GAAP provides this guarantee deterministically, without trusting the agent with private user data, and without requiring any AI model or the user prompt to be free of attacks. GAAP enforces the user's permission specification by tracking how the AI agent accesses and uses private user data. It augments Information Flow Control with novel persistent data stores and annotations that enable it to track the flow of private information both across execution steps within a single task and across multiple tasks separated in time. Our evaluation confirms that GAAP blocks all data disclosure attacks, including those that make other state-of-the-art systems disclose private user data to untrusted parties, without a significant impact on agent utility.
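The abstract's core idea, tracking which private data a value derives from and checking that against a user permission specification before any disclosure, can be illustrated with a minimal Information Flow Control sketch. This is not GAAP's implementation: the names `Labeled`, `combine`, `disclose`, and `DisclosureBlocked`, the dictionary-based permission format, and the example recipients are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Labeled:
    """A value paired with the private-data labels it derives from (hypothetical type)."""
    value: str
    labels: frozenset = field(default_factory=frozenset)


class DisclosureBlocked(Exception):
    """Raised when a disclosure would violate the permission specification."""


def combine(template: str, *parts: Labeled) -> Labeled:
    """Derive a new value; its label set is the union of all input label sets."""
    labels = frozenset().union(*(p.labels for p in parts))
    return Labeled(template.format(*(p.value for p in parts)), labels)


def disclose(data: Labeled, recipient: str, permissions: dict) -> str:
    """Release the value only if every label on it permits this recipient."""
    for label in sorted(data.labels):
        if recipient not in permissions.get(label, set()):
            raise DisclosureBlocked(f"{label} may not flow to {recipient}")
    return data.value


# Assumed permission specification: label -> recipients allowed to see it.
permissions = {
    "card_number": {"payments.example"},
    "home_address": {"payments.example", "courier.example"},
}

card = Labeled("4111-XXXX", frozenset({"card_number"}))
address = Labeled("1 Main St", frozenset({"home_address"}))

# The derived order string carries both labels.
order = combine("ship to {0}, pay with {1}", address, card)

allowed = disclose(order, "payments.example", permissions)

try:
    disclose(order, "courier.example", permissions)
    blocked = False
except DisclosureBlocked:
    blocked = True  # the card number is not permitted to reach the courier
```

The check in `disclose` is a plain set-membership test, so its outcome does not depend on any model behaving well. That is the sense in which a label-based check can be deterministic, although the real system's enforcement is described only at a high level here.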


Key Contributions

  • GAAP execution environment that enforces user privacy specifications deterministically without trusting the AI agent or model provider
  • Novel persistent information flow control mechanism tracking private data across multiple agent tasks separated in time
  • Guaranteed protection against data exfiltration attacks including prompt injection, with evaluation showing 100% attack blocking rate
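To make the second contribution concrete, the sketch below shows one way labels could survive between tasks separated in time: they are written to durable storage next to the value and restored on read. It is an assumption-laden illustration, not the paper's design; `PersistentLabelStore`, the JSON file layout, and the key names are hypothetical.

```python
import json
import os
import tempfile


class PersistentLabelStore:
    """Hypothetical store that keeps each value together with its labels on disk."""

    def __init__(self, path: str):
        self.path = path

    def _load(self) -> dict:
        if not os.path.exists(self.path):
            return {}
        with open(self.path) as f:
            return json.load(f)

    def put(self, key: str, value: str, labels: frozenset) -> None:
        """Write the value and its labels so a later task can recover both."""
        records = self._load()
        records[key] = {"value": value, "labels": sorted(labels)}
        with open(self.path, "w") as f:
            json.dump(records, f)

    def get(self, key: str):
        """Return (value, labels); the labels are restored, not recomputed."""
        record = self._load()[key]
        return record["value"], frozenset(record["labels"])


with tempfile.TemporaryDirectory() as workdir:
    path = os.path.join(workdir, "agent_store.json")

    # Task 1: the agent saves a note derived from the user's address.
    PersistentLabelStore(path).put(
        "delivery_note", "leave at 1 Main St", frozenset({"home_address"})
    )

    # Task 2, later: a fresh store object reads the note back, still labeled.
    value, labels = PersistentLabelStore(path).get("delivery_note")
```

Because the second task receives the original label set along with the value, any later disclosure check can treat the note as derived from the address even though the task that created it has long finished.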


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time, black_box
Applications
ai agents, personal assistants, multi-step task automation