MCPSafe — 7 Coordinated Disclosures After Scanning 50+ MCP Servers

AI relevance: MCP servers are the tool-layer for agentic AI systems — vulnerabilities in servers from Atlassian, GitHub, Anthropic, Microsoft, and Cloudflare expose the bridge between LLM instructions and real-world actions.

MCPSafe, an automated security scanning platform for Model Context Protocol servers, released findings from three months of automated analysis across 50+ MCP repositories on GitHub, npm, and PyPI. The platform uses a five-model LLM judge panel with a custom AIVSS (AI Vulnerability Severity Score) rubric. The majority of scanned servers received a grade of D or lower.

Key findings

  • Indirect prompt injection is the most common critical. Official MCP servers from Atlassian, GitHub, and Cloudflare fetch Jira tickets, GitHub issues, Confluence pages, and web content verbatim — returning user-controlled data to the LLM without provenance delimiters. An attacker who can write to those content sources can inject instructions directly into the agent's context window.
  • ReadOnlyHint mislabeling creates silent privilege escalation. GitHub's official github-mcp-server sets readOnlyHint: true on tools that can be combined in dynamic toolset mode to achieve write operations. Agents that trust this annotation may skip confirmation prompts, creating an uncontrolled escalation path.
  • SSRF in HTTP-calling tools across multiple vendors. Microsoft's playwright-mcp navigate tool accepts arbitrary URLs without allowlist validation. An attacker controlling task content can force the MCP server to probe internal infrastructure or metadata endpoints.
  • Supabase MCP received the highest severity score (AIVSS 8.8). IDOR combined with hidden prompt injection in the search_docs tool allows unauthorized document access through the agent layer.

The disclosures (D001–D007)

  • D001 — Anthropic: Indirect prompt injection in MCP servers (AIVSS 6.0, Reported)
  • D002 — Cloudflare: Tool poisoning chain via document retrieval (AIVSS 7.1, Reported)
  • D003 — Supabase: IDOR + hidden prompt injection in search_docs (AIVSS 8.8, Reported)
  • D004 — Microsoft: SSRF in playwright-mcp navigate tool (AIVSS 7.1, Reported)
  • D005 — Obsidian: SSRF in obsidian-mcp-tools fetch tool (AIVSS 7.1, Reported)
  • D006 — GitHub: ReadOnlyHint mislabeling in dynamic toolset mode (AIVSS 7.1, Reported)
  • D007 — Atlassian: Indirect prompt injection + tool poisoning via remote endpoint (AIVSS 6.0/7.1, Reported)

Why it matters

These are not obscure community projects — these are official MCP servers from major vendors. Every MCP server that returns fetched content verbatim to the LLM without provenance delimiters is an indirect prompt injection vector. For enterprises deploying agent systems, this means any external data source (Jira, GitHub, Confluence, docs) becomes an attack surface that bypasses traditional network controls.

The ReadOnlyHint mislabeling finding is particularly concerning: it shows that MCP's advisory annotations are being trusted as enforcement boundaries, which they are not by design.

What to do

  • Wrap all fetched external content in provenance delimiters before returning to the LLM (e.g., <external_content trusted="false">)
  • Audit readOnlyHint and destructiveHint annotations — only set readOnlyHint: true for tools with genuinely zero side effects
  • Validate all URL inputs against allowlists in HTTP-calling tools
  • Pin GitHub Actions to commit SHA, not @v1 tags
  • Run MCP servers as non-root users in containerized deployments

Sources