defense 2025

Securing AI Agent Execution

Christoph Bühler ¹, Matteo Biagiola ¹,², Luca Di Grazia ¹, Guido Salvaneschi ¹

¹ University of St. Gallen

² Università della Svizzera italiana

7 citations · 50 references · arXiv

Published on arXiv

2510.21236

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Excessive Agency

OWASP LLM Top 10 — LLM08

Key Finding

AgentBound automatically generates correct access control policies for 80.9% of MCP servers and blocks the majority of security threats in malicious MCP servers with negligible enforcement overhead.

AgentBound

Novel technique introduced

Large Language Models (LLMs) have evolved into AI agents that interact with external tools and environments to perform complex tasks. The Model Context Protocol (MCP) has become the de facto standard for connecting agents with such resources, but security has lagged behind: thousands of MCP servers execute with unrestricted access to host systems, creating a broad attack surface. In this paper, we introduce AgentBound, the first access control framework for MCP servers. AgentBound combines a declarative policy mechanism, inspired by the Android permission model, with a policy enforcement engine that contains malicious behavior without requiring MCP server modifications. We build a dataset containing the 296 most popular MCP servers, and show that access control policies can be generated automatically from source code with 80.9% accuracy. We also show that AgentBound blocks the majority of security threats in several malicious MCP servers, and that the policy enforcement engine introduces negligible overhead. Our contributions provide developers and project managers with a practical foundation for securing MCP servers while maintaining productivity, and enable researchers and tool builders to explore new directions for declarative access control and MCP security.

Key Contributions

AgentBound: first access control framework for MCP servers, combining a declarative Android-inspired permission model with a sandboxed policy enforcement engine requiring no MCP server modifications
Dataset of 296 popular MCP servers with automatic policy generation from source code achieving 80.9% accuracy using LLM-based analysis
Empirical evaluation showing AgentBound blocks the majority of threats from malicious MCP servers with negligible runtime overhead
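
The declarative, Android-inspired permission model described above can be pictured with a small sketch. This is not AgentBound's actual manifest format or API: the `Manifest` class, the permission names (`filesystem.read`, `network.connect`), and the prefix-matching rule are all hypothetical, chosen only to illustrate the default-deny idea of granting a server nothing beyond what its policy declares.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class Manifest:
    """Hypothetical declarative policy for a single MCP server."""
    server: str
    # Permission name -> allowed resource prefixes (illustrative names only).
    grants: Dict[str, Tuple[str, ...]] = field(default_factory=dict)


def is_allowed(manifest: Manifest, permission: str, resource: str) -> bool:
    """Default-deny check: allow only resources under a declared prefix.

    A real enforcement engine would act at the sandbox or OS level; naive
    string-prefix matching here only illustrates the policy semantics.
    """
    prefixes = manifest.grants.get(permission, ())
    return any(resource.startswith(prefix) for prefix in prefixes)


# Assumed example policy: read access to one directory, one network endpoint.
manifest = Manifest(
    server="example-files-server",
    grants={
        "filesystem.read": ("/workspace/",),
        "network.connect": ("api.example.com:443",),
    },
)

decisions = {
    "read_workspace": is_allowed(manifest, "filesystem.read", "/workspace/notes.txt"),
    "read_ssh_key": is_allowed(manifest, "filesystem.read", "/home/user/.ssh/id_rsa"),
    "write_workspace": is_allowed(manifest, "filesystem.write", "/workspace/notes.txt"),
}

for name, allowed in decisions.items():
    print(f"{name}: {'allow' if allowed else 'deny'}")
```

In this sketch, a permission that is never declared (here `filesystem.write`) is denied outright, which mirrors how an undeclared Android permission is simply unavailable to an app.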

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

inference_time, black_box

Datasets

296 MCP servers (custom dataset)

Applications

llm agents, mcp server security, ai agent tool access control

Similar Papers

From Tool Orchestration to Code Execution: A Study of MCP Design Choices

Autonomous Action Runtime Management (AARM): A System Specification for Securing AI-Driven Actions at Runtime

MiniScope: A Least Privilege Framework for Authorizing Tool Calling Agents

TraceAegis: Securing LLM-Based Agents via Hierarchical and Behavioral Anomaly Detection

Tracking Capabilities for Safer Agents

Agentic JWT: A Secure Delegation Protocol for Autonomous AI Agents

Execution Is the New Attack Surface: Survivability-Aware Agentic Crypto Trading with OpenClaw-Style Local Executors

Mind the Gap: Time-of-Check to Time-of-Use Vulnerabilities in LLM-Enabled Agents