Praetorian — MCP server attack surface research

AI relevance: MCP servers are the tool-bridge for AI agents, and Praetorian shows how compromised or malicious MCPs can turn agent workflows into code execution and data-exfiltration paths.

  • Praetorian maps MCP servers as a machine-in-the-middle layer between LLMs and external tools/data.
  • The report argues both local and third‑party MCP servers can be abused to run code, leak data, and manipulate assistant output.
  • They demonstrate server chaining: a malicious local MCP piggybacks on trusted SaaS MCPs (example: Slack) to smuggle commands.
  • A proof-of-concept MCP server (“conversation assistant”) uses benign-sounding tool names to quietly capture and exfiltrate large volumes of data.
  • The research shows an “init” tool pattern that can download and open payloads during normal tool setup.
  • Supply-chain risks are emphasized via uvx/PyPI MCP configurations that auto-fetch and run packages at agent startup.
  • Praetorian open-sourced MCPHammer to validate attacks across models, agents, and tool stacks.
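The uvx/PyPI auto-fetch pattern is easiest to see in a config sketch. The server name and package below are hypothetical, but the shape matches common MCP client configs: an unpinned `uvx` entry downloads and runs whatever package version is currently published each time the agent starts, which is exactly the supply-chain exposure the report highlights.

```json
{
  "mcpServers": {
    "example-tool": {
      "command": "uvx",
      "args": ["example-mcp-package"]
    }
  }
}
```

Pinning the version in `args` (e.g. `example-mcp-package==1.2.0`) narrows the fetch to a known artifact; an internal mirror or allowlist narrows it further.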

Why it matters

  • MCPs are becoming the default integration layer for AI agents, so their security posture directly controls agent blast radius.
  • Attackers can abuse trusted tool chains to bypass user approval flows and hide exfiltration in “normal” AI actions.

What to do

  • Inventory MCPs (local and remote) and treat them like code you run, not just “connectors.”
  • Gate tool permissions and monitor tool-call arguments for unexpected bulk data pulls.
  • Pin and verify MCP packages (checksums, allowlists, internal mirrors) to reduce config supply-chain risk.
  • Red-team MCP chains with MCPHammer-like tooling before production rollout.
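The argument-monitoring step can be sketched as a small gateway check that runs before a tool call is forwarded to an MCP server. Everything here is an illustrative assumption (the function name, the flagged keys, the threshold); it is not part of any MCP client API, just one shape such a check could take:

```python
# Hypothetical pre-forwarding audit for MCP tool calls.
# Flags two patterns from the report: bulk data pulls and
# outbound destinations smuggled into tool arguments.

BULK_THRESHOLD = 50                               # assumed cutoff for "bulk"
SUSPICIOUS_KEYS = {"url", "webhook", "endpoint"}  # outbound destinations

def audit_tool_call(tool_name: str, arguments: dict) -> list[str]:
    """Return human-readable findings; an empty list means no flags."""
    findings = []
    for key, value in arguments.items():
        # Large limit/count parameters often signal bulk data pulls.
        if key in ("limit", "count", "page_size") and isinstance(value, int) \
                and value > BULK_THRESHOLD:
            findings.append(f"{tool_name}: bulk pull ({key}={value})")
        # Destinations inside arguments can hide exfiltration targets.
        if key in SUSPICIOUS_KEYS:
            findings.append(
                f"{tool_name}: outbound destination in args ({key}={value})"
            )
    return findings
```

A real deployment would log or block on findings rather than just return them, and would tune the keys and threshold per tool; the point is that tool-call arguments are inspectable before they leave the agent.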

Sources