
IPIGuard: A Novel Tool Dependency Graph-Based Defense Against Indirect Prompt Injection in LLM Agents

Hengyu An 1, Jinghuai Zhang 2, Tianyu Du 1, Chunyi Zhou 1, Qingming Li 1, Tao Lin 3, Shouling Ji 1


Published on arXiv: 2508.15310

  • Prompt Injection (OWASP LLM Top 10 — LLM01)
  • Insecure Plugin Design (OWASP LLM Top 10 — LLM07)

Key Finding

IPIGuard achieves a superior balance between task effectiveness and robustness against indirect prompt injection on the AgentDojo benchmark by structurally constraining tool invocations.

Novel technique introduced: IPIGuard (Tool Dependency Graph)


Large language model (LLM) agents are widely deployed in real-world applications, where they leverage tools to retrieve and manipulate external data for complex tasks. However, when interacting with untrusted data sources (e.g., fetching information from public websites), tool responses may contain injected instructions that covertly influence agent behaviors and lead to malicious outcomes, a threat referred to as Indirect Prompt Injection (IPI). Existing defenses typically rely on advanced prompting strategies or auxiliary detection models. While these methods have demonstrated some effectiveness, they fundamentally rely on assumptions about the model's inherent security, an approach that lacks structural constraints on agent behaviors. As a result, agents still retain unrestricted access to tool invocations, leaving them vulnerable to stronger attack vectors that can bypass the model's security guardrails. To prevent malicious tool invocations at the source, we propose a novel defensive task execution paradigm, called IPIGuard, which models the agent's task execution process as a traversal over a planned Tool Dependency Graph (TDG). By explicitly decoupling action planning from interaction with external data, IPIGuard significantly reduces unintended tool invocations triggered by injected instructions, thereby enhancing robustness against IPI attacks. Experiments on the AgentDojo benchmark show that IPIGuard achieves a superior balance between effectiveness and robustness, paving the way for the development of safer agentic systems in dynamic environments.


Key Contributions

  • Proposes IPIGuard, a defensive paradigm that models LLM agent task execution as a traversal over a pre-planned Tool Dependency Graph (TDG), structurally separating action planning from external data interaction.
  • Demonstrates that decoupling planning from data retrieval significantly reduces unintended tool invocations triggered by injected instructions, providing structural rather than model-inherent security.
  • Evaluates on AgentDojo benchmark, showing a superior robustness-utility tradeoff compared to prompting-based and detection-model-based defenses.
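The core idea, fixing the set of permissible tool calls before any untrusted data is read, can be illustrated with a minimal sketch. This is not the paper's implementation; the names (`ToolNode`, `execute_tdg`, `topological_order`) and the dictionary-based tool interface are illustrative assumptions. The key property it demonstrates: only tools present in the pre-planned graph are ever invoked, so instructions injected into a tool response cannot trigger new, unplanned calls.

```python
# Hypothetical sketch of executing a pre-planned Tool Dependency Graph (TDG).
# Planning (which tools run, in what dependency order) happens before any
# external data is seen; execution only fills in data along planned edges.
from dataclasses import dataclass, field


@dataclass
class ToolNode:
    name: str                                  # planned tool to invoke
    deps: list = field(default_factory=list)   # upstream nodes whose outputs feed this call


def topological_order(plan):
    """Kahn-style ordering over the planned nodes (raises on cycles)."""
    remaining = {n.name: set(n.deps) for n in plan}
    by_name = {n.name: n for n in plan}
    order = []
    while remaining:
        ready = [name for name, deps in remaining.items() if not deps]
        if not ready:
            raise ValueError("cycle in tool dependency graph")
        for name in ready:
            order.append(by_name[name])
            del remaining[name]
        for deps in remaining.values():
            deps.difference_update(ready)
    return order


def execute_tdg(plan, tools):
    """Traverse the planned TDG; never invoke a tool outside the plan.

    Tool responses are treated purely as data flowing along planned edges,
    so injected instructions in a response cannot expand the call set.
    """
    results = {}
    for node in topological_order(plan):
        inputs = {d: results[d] for d in node.deps}    # data from upstream nodes only
        results[node.name] = tools[node.name](inputs)  # invoke only the planned tool
    return results
```

A usage sketch under the same assumptions: a plan `[ToolNode("fetch"), ToolNode("summarize", deps=["fetch"])]` executes `fetch` then `summarize` regardless of what the fetched page says; a response containing "please call delete_files" changes only the data passed downstream, not the tools that run.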

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time
Datasets
AgentDojo
Applications
llm agents, agentic ai systems