defense 2026

Optimizing Agent Planning for Security and Autonomy

Aashish Kolluri ¹, Rishi Sharma ¹,², Manuel Costa ¹, Boris Köpf ¹, Tobias Nießen ³, Mark Russinovich ¹, Shruti Tople ¹, Santiago Zanella-Béguelin ¹

¹ Microsoft

² EPFL

³ TU Wien

0 citations · 34 references · arXiv (Cornell University)

Published on arXiv

2602.11416

Prompt Injection

OWASP LLM Top 10 — LLM01

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

Security-aware agent planning (Prudentia) achieves higher autonomous action rates without sacrificing task completion utility, demonstrating that deterministic IFC defenses against prompt injection need not be prohibitively costly when paired with autonomy-aware planning.

Novel technique introduced

Prudentia

Indirect prompt injection attacks threaten AI agents that execute consequential actions, motivating deterministic system-level defenses. Such defenses can provably block unsafe actions by enforcing confidentiality and integrity policies, but currently appear costly: they reduce task completion rates and increase token usage compared to probabilistic defenses. We argue that existing evaluations miss a key benefit of system-level defenses: reduced reliance on human oversight. We introduce autonomy metrics to quantify this benefit: the fraction of consequential actions an agent can execute without human-in-the-loop (HITL) approval while preserving security. To increase autonomy, we design a security-aware agent that (i) introduces richer HITL interactions, and (ii) explicitly plans for both task progress and policy compliance. We implement this agent design atop an existing information-flow control defense against prompt injection and evaluate it on the AgentDojo and WASP benchmarks. Experiments show that this approach yields higher autonomy without sacrificing utility.
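The idea of a deterministic, system-level defense that falls back to human approval can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the two-level integrity label, the `Labeled` wrapper, and the `decide` function are hypothetical names, and the policy shown (a consequential action needs approval whenever any argument derives from untrusted data) is a generic information-flow rule, not the policy of Prudentia or of the defense it builds on.

```python
from dataclasses import dataclass
from enum import Enum


class Integrity(Enum):
    """Hypothetical two-level integrity lattice (assumption, not from the paper)."""
    TRUSTED = 0
    UNTRUSTED = 1


@dataclass(frozen=True)
class Labeled:
    """A value together with the integrity label of the data it came from."""
    value: str
    integrity: Integrity


def join(*labels: Integrity) -> Integrity:
    """Combine labels: the result is untrusted if any input is untrusted."""
    return Integrity.UNTRUSTED if Integrity.UNTRUSTED in labels else Integrity.TRUSTED


def decide(consequential: bool, args: list[Labeled]) -> str:
    """Return "allow" or "ask_human" for a proposed tool call.

    Non-consequential calls always run. Consequential calls run
    autonomously only when every argument is trusted; otherwise the
    decision is deferred to a human (HITL approval).
    """
    if not consequential:
        return "allow"
    label = join(*(a.integrity for a in args))
    return "allow" if label is Integrity.TRUSTED else "ask_human"
```

Under this sketch, a consequential call built only from user-provided text would run without approval, while the same call with one argument copied from a fetched web page would be routed to a human. A planner that avoids mixing untrusted data into such arguments would therefore trigger fewer approval requests, which is the kind of effect the autonomy metrics are meant to capture.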

Key Contributions

Autonomy metrics that quantify the fraction of consequential actions an agent can execute without human-in-the-loop approval while preserving security guarantees
Prudentia: a security-aware agent design that combines richer HITL interactions with explicit planning for both task progress and policy compliance
Empirical demonstration on AgentDojo and WASP that security-aware planning improves autonomy without sacrificing task utility under information-flow control defenses
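As a rough illustration of the autonomy metric described in the first contribution, the sketch below computes the fraction of consequential actions in a trace that ran without human approval. The `ActionRecord` type, its fields, and the `autonomy_rate` function are hypothetical names introduced here; the paper's exact definitions may differ, for example in how actions are classified as consequential or how results are aggregated across tasks.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionRecord:
    """One executed action in an agent trace (hypothetical schema)."""
    consequential: bool  # does the action have side effects that matter?
    needed_hitl: bool    # did it require human-in-the-loop approval?


def autonomy_rate(trace: list[ActionRecord]) -> Optional[float]:
    """Fraction of consequential actions executed without HITL approval.

    Returns None when the trace has no consequential actions, since the
    fraction is undefined in that case.
    """
    consequential = [a for a in trace if a.consequential]
    if not consequential:
        return None
    autonomous = sum(1 for a in consequential if not a.needed_hitl)
    return autonomous / len(consequential)


# Example trace: four consequential actions, one of which needed approval,
# plus one non-consequential action that does not count toward the metric.
trace = [
    ActionRecord(consequential=True, needed_hitl=False),
    ActionRecord(consequential=True, needed_hitl=False),
    ActionRecord(consequential=True, needed_hitl=True),
    ActionRecord(consequential=True, needed_hitl=False),
    ActionRecord(consequential=False, needed_hitl=False),
]
rate = autonomy_rate(trace)  # 3 of 4 consequential actions ran autonomously
```

Returning `None` for a trace without consequential actions keeps such traces from being counted as either fully autonomous or fully supervised; that choice is an assumption of this sketch.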

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

inference_time

Datasets

AgentDojo, WASP

Applications

ai agents, agentic ai systems, llm tool-use pipelines

Similar Papers

A2AS: Agentic AI Runtime Security and Self-Defense

Spider-Sense: Intrinsic Risk Sensing for Efficient Agent Defense with Hierarchical Adaptive Screening

AgentSentinel: An End-to-End and Real-Time Security Defense Framework for Computer-Use Agents

Policy Compiler for Secure Agentic Systems

BlockA2A: Towards Secure and Verifiable Agent-to-Agent Interoperability

The LLMbda Calculus: AI Agents, Conversations, and Information Flow

Agent-Sentry: Bounding LLM Agents via Execution Provenance

Securing AI Agents: Implementing Role-Based Access Control for Industrial Applications