benchmark 2025

MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

Yixuan Yang 1,2, Cuifeng Gao 2, Daoyuan Wu 2, Yufan Chen 2, Yingjiu Li 3, Shuai Wang 4

0 citations

α

Published on arXiv

2508.13220

Insecure Plugin Design

OWASP LLM Top 10 — LLM07

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

All four MCP attack surfaces yield successful compromises across Claude, OpenAI, and Cursor; existing protection mechanisms average below 30% effectiveness, with core vulnerabilities universally affecting all three platforms.

MCPSecBench

Novel technique introduced


Large Language Models (LLMs) are increasingly integrated into real-world applications via the Model Context Protocol (MCP), a universal open standard for connecting AI agents with data sources and external tools. While MCP enhances the capabilities of LLM-based agents, it also introduces new security risks and significantly expands their attack surface. In this paper, we present the first formalization of a secure MCP and its required specifications. Based on this foundation, we establish a comprehensive MCP security taxonomy that extends existing models by incorporating protocol-level and host-side threats, identifying 17 distinct attack types across four primary attack surfaces. Building on these specifications, we introduce MCPSecBench, a systematic security benchmark and playground that integrates prompt datasets, MCP servers, MCP clients, attack scripts, a GUI test harness, and protection mechanisms to evaluate these threats across three major MCP platforms. MCPSecBench is designed to be modular and extensible, allowing researchers to incorporate custom implementations of clients, servers, and transport protocols for rigorous assessment. Our evaluation across three major MCP platforms reveals that all attack surfaces yield successful compromises. Core vulnerabilities universally affect Claude, OpenAI, and Cursor, while server-side and specific client-side attacks exhibit considerable variability across different hosts and models. Furthermore, current protection mechanisms proved largely ineffective, achieving an average success rate of less than 30%. Overall, MCPSecBench standardizes the evaluation of MCP security and enables rigorous testing across all protocol layers.


Key Contributions

  • First formalization of secure MCP specifications and a comprehensive taxonomy of 17 MCP attack types across four attack surfaces (protocol-level, host-side, server-side, client-side)
  • MCPSecBench: a modular, extensible benchmark integrating prompt datasets, MCP servers/clients, attack scripts, GUI test harness, and protection mechanisms for evaluating MCP security
  • Empirical evaluation across Claude, OpenAI, and Cursor showing universal core vulnerabilities and that current protection mechanisms achieve less than 30% average success rate

🛡️ Threat Analysis


Details

Domains
nlp
Model Types
llm
Threat Tags
black_boxinference_time
Datasets
MCPSecBench prompt datasets (introduced by paper)
Applications
llm agentsai coding assistantsmcp-integrated chatbots