benchmark 2025

MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

0 citations

Published on arXiv

2508.13220

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

All four MCP attack surfaces yield successful compromises across Claude, OpenAI, and Cursor; existing protection mechanisms average below 30% effectiveness, with core vulnerabilities universally affecting all three platforms.

MCPSecBench

Novel technique introduced

Large Language Models (LLMs) are increasingly integrated into real-world applications via the Model Context Protocol (MCP), a universal open standard for connecting AI agents with data sources and external tools. While MCP enhances the capabilities of LLM-based agents, it also introduces new security risks and significantly expands their attack surface. In this paper, we present the first formalization of a secure MCP and its required specifications. Based on this foundation, we establish a comprehensive MCP security taxonomy that extends existing models by incorporating protocol-level and host-side threats, identifying 17 distinct attack types across four primary attack surfaces. Building on these specifications, we introduce MCPSecBench, a systematic security benchmark and playground that integrates prompt datasets, MCP servers, MCP clients, attack scripts, a GUI test harness, and protection mechanisms to evaluate these threats across three major MCP platforms. MCPSecBench is designed to be modular and extensible, allowing researchers to incorporate custom implementations of clients, servers, and transport protocols for rigorous assessment. Our evaluation across three major MCP platforms reveals that all attack surfaces yield successful compromises. Core vulnerabilities universally affect Claude, OpenAI, and Cursor, while server-side and specific client-side attacks exhibit considerable variability across different hosts and models. Furthermore, current protection mechanisms proved largely ineffective, achieving an average success rate of less than 30%. Overall, MCPSecBench standardizes the evaluation of MCP security and enables rigorous testing across all protocol layers.

Key Contributions

First formalization of secure MCP specifications and a comprehensive taxonomy of 17 MCP attack types across four attack surfaces (protocol-level, host-side, server-side, client-side)
MCPSecBench: a modular, extensible benchmark integrating prompt datasets, MCP servers/clients, attack scripts, GUI test harness, and protection mechanisms for evaluating MCP security
Empirical evaluation across Claude, OpenAI, and Cursor showing universal core vulnerabilities and that current protection mechanisms achieve less than 30% average success rate

🛡️ Threat Analysis

Details

Domains

nlp

Model Types

llm

Threat Tags

black_boxinference_time

Datasets

MCPSecBench prompt datasets (introduced by paper)

Applications

llm agentsai coding assistantsmcp-integrated chatbots

Read PDF arXiv

MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

Key Contributions

🛡️ Threat Analysis

Details

Similar Papers

Systematic Analysis of MCP Security

MCP Security Bench (MSB): Benchmarking Attacks Against Model Context Protocol in LLM Agents

When AI Meets the Web: Prompt Injection Risks in Third-Party AI Chatbot Plugins

Unsafer in Many Turns: Benchmarking and Defending Multi-Turn Safety Risks in Tool-Using Agents

Quantifying Distributional Robustness of Agentic Tool-Selection

Skill-Inject: Measuring Agent Vulnerability to Skill File Attacks

MCPTox: A Benchmark for Tool Poisoning Attack on Real-World MCP Servers

Model Context Protocol Threat Modeling and Analyzing Vulnerabilities to Prompt Injection with Tool Poisoning