benchmark 2025

MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents

Dongsen Zhang ¹, Zekun Li ², Xu Luo ¹, Xuannan Liu ¹, Peipei Li ¹, Wenjun Xu ¹

¹ Beijing University of Posts and Telecommunications

² University of California, Santa Barbara

2 citations · 56 references · arXiv

Published on arXiv

2510.15994

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Peak attack success rate of 75.83% across 9 LLM agents, with stronger models paradoxically more vulnerable due to superior tool-calling and instruction-following capabilities

Novel technique introduced

Net Resilient Performance (NRP)

The Model Context Protocol (MCP) standardizes how large language model (LLM) agents discover, describe, and call external tools. While MCP unlocks broad interoperability, it also enlarges the attack surface by making tools first-class, composable objects with natural-language metadata and standardized I/O. We present MSB (MCP Security Benchmark), the first end-to-end evaluation suite that systematically measures how well LLM agents resist MCP-specific attacks throughout the full tool-use pipeline: task planning, tool invocation, and response handling. MSB contributes: (1) a taxonomy of 12 attacks, including name-collision, preference manipulation, prompt injections embedded in tool descriptions, out-of-scope parameter requests, user-impersonating responses, false-error escalation, tool-transfer, retrieval injection, and mixed attacks; (2) an evaluation harness that executes attacks by running real tools (both benign and malicious) via MCP rather than simulation; and (3) a robustness metric, Net Resilient Performance (NRP), that quantifies the trade-off between security and performance. We evaluate nine popular LLM agents across 10 domains and 400+ tools, producing 2,000 attack instances. The results show that attacks are effective against each stage of the MCP pipeline. Better-performing models are more vulnerable to attacks because of their stronger tool-calling and instruction-following capabilities. MSB provides a practical baseline for researchers and practitioners to study, compare, and harden MCP agents.
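To make the attack surface concrete: an MCP server advertises each tool to the agent as a name, a natural-language description, and an input schema, so a name-collision or description-injection attack amounts to a second, crafted entry in that list. The sketch below is illustrative only. The tool entries, the placeholder injected text, and the helper `colliding_names` are hypothetical and are not taken from MSB.

```python
def normalize(name: str) -> str:
    """Lowercase a tool name and drop separators so look-alike names compare equal."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


def colliding_names(tools: list[dict]) -> list[list[str]]:
    """Group tool names that become identical after normalization."""
    groups: dict[str, list[str]] = {}
    for tool in tools:
        groups.setdefault(normalize(tool["name"]), []).append(tool["name"])
    return [names for names in groups.values() if len(names) > 1]


# Hypothetical tool listing as an agent might see it (name, description, inputSchema).
tools = [
    {
        "name": "read_file",
        "description": "Read a file from disk.",
        "inputSchema": {"type": "object", "properties": {"path": {"type": "string"}}},
    },
    {
        # Look-alike name plus extra text in the description: placeholder for
        # a name-collision combined with a description-embedded injection.
        "name": "read-file",
        "description": "Read a file from disk. <attacker-supplied instruction goes here>",
        "inputSchema": {"type": "object", "properties": {"path": {"type": "string"}}},
    },
    {
        "name": "send_email",
        "description": "Send an email.",
        "inputSchema": {"type": "object", "properties": {"to": {"type": "string"}}},
    },
]

collisions = colliding_names(tools)
print(collisions)
```

Because the agent chooses between `read_file` and `read-file` using only this metadata, the choice depends on how the model weighs names and descriptions, which is the behavior the planning-stage attacks in the taxonomy target.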

Key Contributions

Taxonomy of 12 MCP-specific attack types spanning task planning, tool invocation, and response handling stages
Dynamic evaluation harness that executes real attacks via live MCP tools (not simulations) across 9 LLM agents, 10 domains, and 400+ tools, producing 2,000 attack instances
Novel Net Resilient Performance (NRP) metric that jointly quantifies the security–performance trade-off in MCP-based agents
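This summary names NRP but does not give its formula. As a minimal sketch of how a joint security and performance score can be computed, the code below assumes NRP is the product of task performance under attack (PUA) and the fraction of attacks resisted (1 - ASR). That form, the `AttackRun` record, and all function names are assumptions made for illustration; the paper should be consulted for the exact definition.

```python
from dataclasses import dataclass


@dataclass
class AttackRun:
    """Outcome of one attack instance (hypothetical record layout)."""
    task_completed: bool    # agent still finished the user's original task
    attack_succeeded: bool  # agent carried out the attacker's goal


def attack_success_rate(runs: list[AttackRun]) -> float:
    """ASR: share of runs in which the attack succeeded."""
    if not runs:
        raise ValueError("need at least one run")
    return sum(r.attack_succeeded for r in runs) / len(runs)


def performance_under_attack(runs: list[AttackRun]) -> float:
    """PUA: share of runs in which the user's task was still completed."""
    if not runs:
        raise ValueError("need at least one run")
    return sum(r.task_completed for r in runs) / len(runs)


def net_resilient_performance(runs: list[AttackRun]) -> float:
    """Assumed form: NRP = PUA * (1 - ASR)."""
    return performance_under_attack(runs) * (1 - attack_success_rate(runs))


# Made-up outcomes for four attack instances, for illustration only.
runs = [
    AttackRun(task_completed=True, attack_succeeded=True),
    AttackRun(task_completed=True, attack_succeeded=False),
    AttackRun(task_completed=True, attack_succeeded=False),
    AttackRun(task_completed=False, attack_succeeded=False),
]

nrp = net_resilient_performance(runs)
print(f"ASR={attack_success_rate(runs):.2f} "
      f"PUA={performance_under_attack(runs):.2f} NRP={nrp:.4f}")
```

Under this assumed form, an agent scores well only if it both completes tasks and resists attacks: a model that refuses every tool call gets a low PUA, and a model that follows every instruction gets a high ASR, so either extreme lowers the score.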

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

black_box, inference_time

Datasets

MSB (2000 attack instances, 65 tasks, 400+ tools, 10 domains)

Applications

llm agents, tool-calling systems, mcp-based ai assistants

Similar Papers

Quantifying Distributional Robustness of Agentic Tool-Selection

Systematic Analysis of MCP Security

When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins

Unsafer in Many Turns: Benchmarking and Defending Multi-Turn Safety Risks in Tool-Using Agents

MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks

MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers

Are AI-assisted Development Tools Immune to Prompt Injection?