Large Language Models (LLMs) are increasingly integrated into real-world
applications via the Model Context Protocol (MCP), a universal, open standard
for connecting AI agents with data sources and external tools. While MCP
enhances the capabilities of LLM-based agents, it also introduces new security
risks and expands the agents' attack surface. In this paper, we present the first
systematic taxonomy of MCP security, identifying 17 attack types across 4
primary attack surfaces. We introduce MCPSecBench, a comprehensive security
benchmark and playground that integrates prompt datasets, MCP servers, MCP
clients, attack scripts, and protection mechanisms to evaluate these attacks
across three major MCP providers. Our benchmark is modular and extensible,
allowing researchers to incorporate custom implementations of clients, servers,
and transport protocols for systematic security assessment. Experimental
results show that over 85% of the identified attacks successfully compromise at
least one platform. Core vulnerabilities affect all three evaluated platforms
(Claude, OpenAI, and Cursor), while prompt-based and tool-centric attacks vary
considerably across hosts and models. In addition, current protection
mechanisms have little effect against these attacks.
Overall, MCPSecBench standardizes the evaluation of MCP security and enables
rigorous testing across all MCP layers.