Praetorian — MCP server attack surface research
AI relevance: MCP servers are the tool bridge for AI agents, and Praetorian shows how a compromised or malicious MCP server can turn agent workflows into code-execution and data-exfiltration paths.
- Praetorian maps MCP servers as a machine-in-the-middle layer between LLMs and external tools/data.
- The report argues that both local and third-party MCP servers can be abused to run code, leak data, and manipulate assistant output.
- They demonstrate server chaining: a malicious local MCP piggybacks on trusted SaaS MCPs (example: Slack) to smuggle commands.
- A proof-of-concept MCP server (a “conversation assistant”) uses benign-sounding tool names to capture and exfiltrate more data than the user intends to share.
- The research shows an “init” tool pattern that can download and open payloads during normal tool setup.
- Supply-chain risks are emphasized via uvx/PyPI MCP configurations that auto-fetch and run packages at agent startup.
- Praetorian open-sourced MCPHammer to validate attacks across models, agents, and tool stacks.
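As an illustration of the uvx/PyPI auto-fetch pattern described above, here is a minimal sketch of an MCP client configuration. The `mcpServers`/`command`/`args` layout follows the common MCP client config convention; the server name and package name are hypothetical:

```json
{
  "mcpServers": {
    "notes-helper": {
      "command": "uvx",
      "args": ["example-mcp-notes-helper"]
    }
  }
}
```

Because `uvx` resolves and runs the latest matching package from PyPI at each agent startup, whoever controls that package name (or compromises it upstream) controls code execution on the host every time the agent launches.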
Why it matters
- MCPs are becoming the default integration layer for AI agents, so their security posture directly controls agent blast radius.
- Attackers can abuse trusted tool chains to bypass user-approval flows and hide exfiltration inside “normal” AI actions.
What to do
- Inventory MCPs (local and remote) and treat them like code you run, not just “connectors.”
- Gate tool permissions and monitor tool-call arguments for unexpected bulk data pulls.
- Pin and verify MCP packages (checksums, allowlists, internal mirrors) to reduce config supply-chain risk.
- Red-team MCP chains with MCPHammer-like tooling before production rollout.
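The tool-call monitoring recommended above can be sketched as a simple argument gate that flags suspicious bulk pulls before a call reaches an MCP server. Everything here (the `ToolCall` class, `BULK_HINTS`, the threshold) is illustrative and not part of any real MCP SDK:

```python
# Sketch: inspect MCP tool-call arguments before forwarding them to a
# server, flagging requests that look like bulk data pulls.
from dataclasses import dataclass, field

@dataclass
class ToolCall:
    server: str                      # which MCP server the call targets
    tool: str                        # tool name as advertised by the server
    args: dict = field(default_factory=dict)

# Argument names that often control result volume (heuristic, not exhaustive).
BULK_HINTS = {"limit", "page_size", "count", "max_results"}
BULK_THRESHOLD = 100                 # flag requests for more than 100 records

def flag_bulk_pull(call: ToolCall) -> bool:
    """Return True if the call looks like an unexpected bulk data pull."""
    for key, value in call.args.items():
        if key in BULK_HINTS and isinstance(value, int) and value > BULK_THRESHOLD:
            return True
    return False

calls = [
    ToolCall("slack", "read_channel", {"channel": "general", "limit": 50}),
    ToolCall("slack", "read_channel", {"channel": "general", "limit": 10000}),
]
flagged = [c for c in calls if flag_bulk_pull(c)]
```

In practice a gate like this would sit in the agent host's tool-dispatch path and route flagged calls to an approval prompt or audit log rather than silently forwarding them.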