Praetorian — MCP server attack surface research

AI relevance: MCP servers are the tool-bridge for AI agents, and Praetorian shows how compromised or malicious MCPs can turn agent workflows into code execution and data-exfiltration paths.

  • Praetorian maps MCP servers as a machine-in-the-middle layer between LLMs and external tools/data.
  • The report argues both local and third‑party MCP servers can be abused to run code, leak data, and manipulate assistant output.
  • They demonstrate server chaining: a malicious local MCP piggybacks on trusted SaaS MCPs (example: Slack) to smuggle commands.
  • A proof-of-concept MCP server (“conversation assistant”) uses benign-sounding tool names to quietly capture and exfiltrate large volumes of data.
  • The research shows an “init” tool pattern that can download and open payloads during normal tool setup.
  • Supply-chain risks are emphasized via uvx/PyPI MCP configurations that auto-fetch and run packages at agent startup.
  • Praetorian open-sourced MCPHammer to validate attacks across models, agents, and tool stacks.
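The uvx/PyPI auto-fetch pattern is easiest to see in a config sketch. The server name and package below are hypothetical, but the shape matches common MCP client configs: an unpinned `uvx` entry downloads and runs whatever package version is currently published each time the agent starts, which is exactly the supply-chain exposure the report highlights.

```json
{
  "mcpServers": {
    "example-tool": {
      "command": "uvx",
      "args": ["example-mcp-package"]
    }
  }
}
```

Pinning the version in `args` (e.g. `example-mcp-package==1.2.0`) narrows the fetch to a known artifact; an internal mirror or allowlist narrows it further.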

Why it matters

  • MCPs are becoming the default integration layer for AI agents, so their security posture directly controls agent blast radius.
  • Attackers can abuse trusted tool chains to bypass user approval flows and hide exfiltration in “normal” AI actions.

What to do

  • Inventory MCPs (local and remote) and treat them like code you run, not just “connectors.”
  • Gate tool permissions and monitor tool-call arguments for unexpected bulk data pulls.
  • Pin and verify MCP packages (checksums, allowlists, internal mirrors) to reduce config supply-chain risk.
  • Red-team MCP chains with MCPHammer-like tooling before production rollout.
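The argument-monitoring step can be sketched as a small gateway check that runs before a tool call is forwarded to an MCP server. Everything here is an illustrative assumption (the function name, the flagged keys, the threshold); it is not part of any MCP client API, just one shape such a check could take:

```python
# Hypothetical pre-forwarding audit for MCP tool calls.
# Flags two patterns from the report: bulk data pulls and
# outbound destinations smuggled into tool arguments.

BULK_THRESHOLD = 50                               # assumed cutoff for "bulk"
SUSPICIOUS_KEYS = {"url", "webhook", "endpoint"}  # outbound destinations

def audit_tool_call(tool_name: str, arguments: dict) -> list[str]:
    """Return human-readable findings; an empty list means no flags."""
    findings = []
    for key, value in arguments.items():
        # Large limit/count parameters often signal bulk data pulls.
        if key in ("limit", "count", "page_size") and isinstance(value, int) \
                and value > BULK_THRESHOLD:
            findings.append(f"{tool_name}: bulk pull ({key}={value})")
        # Destinations inside arguments can hide exfiltration targets.
        if key in SUSPICIOUS_KEYS:
            findings.append(
                f"{tool_name}: outbound destination in args ({key}={value})"
            )
    return findings
```

A real deployment would log or block on findings rather than just return them, and would tune the keys and threshold per tool; the point is that tool-call arguments are inspectable before they leave the agent.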

Sources