Systematic Analysis of MCP Security
Yongjian Guo 1,2, Puzhuo Liu 2, Wanlun Ma 3, Zehang Deng 3, Xiaogang Zhu 4, Peng Di 2,5, Xi Xiao 1, Sheng Wen 3
Published on arXiv
2508.12538
Insecure Plugin Design
OWASP LLM Top 10 — LLM07
Prompt Injection
OWASP LLM Top 10 — LLM01
Key Finding
Empirical evaluation of 31 MCP attack methods reveals LLM agents are especially vulnerable to file-based injection, chain attacks via shared context, and sycophantic compliance with malicious tool descriptions
MCPLIB
Novel technique introduced
The Model Context Protocol (MCP) has emerged as a universal standard that enables AI agents to seamlessly connect with external tools, significantly enhancing their functionality. However, while MCP brings notable benefits, it also introduces significant vulnerabilities, such as Tool Poisoning Attacks (TPA), where hidden malicious instructions exploit the sycophancy of large language models (LLMs) to manipulate agent behavior. Despite these risks, current academic research on MCP security remains limited, with most studies focusing on narrow or qualitative analyses that fail to capture the diversity of real-world threats. To address this gap, we present the MCP Attack Library (MCPLIB), which categorizes and implements 31 distinct attack methods under four key classifications: direct tool injection, indirect tool injection, malicious user attacks, and LLM inherent attack. We further conduct a quantitative analysis of the efficacy of each attack. Our experiments reveal key insights into MCP vulnerabilities, including agents' blind reliance on tool descriptions, sensitivity to file-based attacks, chain attacks exploiting shared context, and difficulty distinguishing external data from executable commands. These insights, validated through attack experiments, underscore the urgency for robust defense strategies and informed MCP design. Our contributions include 1) constructing a comprehensive MCP attack taxonomy, 2) introducing a unified attack framework MCPLIB, and 3) conducting empirical vulnerability analysis to enhance MCP security mechanisms. This work provides a foundational framework, supporting the secure evolution of MCP ecosystems.
Key Contributions
- MCP attack taxonomy covering 31 distinct attack methods across four categories: direct tool injection, indirect tool injection, malicious user attacks, and LLM inherent attacks
- MCPLIB — a unified, open attack framework that implements all 31 attacks for reproducible empirical evaluation of MCP vulnerabilities
- Quantitative vulnerability analysis revealing agents' blind reliance on tool descriptions, sensitivity to file-based attacks, and inability to distinguish external data from executable commands