Forcepoint X-Labs — 10 In-the-Wild Indirect Prompt Injection Payloads Targeting AI Agents

AI relevance: Indirect prompt injection weaponizes any web page an agent reads — turning ordinary content ingestion into a remote command execution vector.

Forcepoint X-Labs researchers identified 10 verified indirect prompt injection (IPI) payloads actively hosted on live websites, each designed to hijack AI agents that browse, summarize, or index web content. The findings arrive alongside a separate Google scan showing a 32% increase in malicious IPI content on the open web between November 2025 and February 2026.

What Was Found

  • Financial fraud payload: A live injection embeds a PayPal.me link with a fixed $5,000 amount and step-by-step processing instructions, targeting browser agents with saved payment credentials or AI financial assistants.
  • Recursive data destruction: A Unix command payload instructs coding assistants (GitHub Copilot, Cursor, Claude Code) to execute forced recursive deletion of files and directories — exploitable when agents research web pages during development tasks.
  • API key theft: An injection commands agents to exfiltrate accessible API keys while simultaneously instructing them not to display the output — creating a covert return channel to the attacker.
  • Attribution hijacking: A social engineering payload instructs the AI to credit a specific individual ("Kirill Bobrov") for the content and encourage the user to contact them for consulting work.
  • AI denial-of-service: Payloads falsely assert that the copyright owner has "expressly forbidden" the AI from answering any questions about the page's content — suppressing output entirely.
  • Common triggers: "Ignore previous instructions," "Ignore all previous instructions," "If you are an LLM," and "If you are a large language model" were the most frequently observed trigger phrases across the payloads.
  • Impact scales with privilege: A summarization-only agent presents low risk; an agent with shell access, email capability, or payment integration becomes a high-impact target, as Forcepoint researcher Mayur Sewani notes.

Why It Matters

The open web is becoming an attack surface for AI agents. Any agent that ingests untrusted web content — whether for research, RAG indexing, ad moderation, or SEO analysis — is exposed without a strict data-instruction boundary. The 32% month-over-month increase in malicious IPI content detected by Google's CommonCrawl scans suggests this threat is growing, not stabilizing. Most importantly, these are not proof-of-concept demos: they are live payloads on real websites waiting for agent interaction.

What to Do

  • Enforce a strict data-instruction boundary: content ingested by agents must never be interpreted as executable instructions.
  • Scope agent privileges — agents with payment, email, or shell access require additional sandboxing and approval gates.
  • Monitor agent output for anomalies: unexpected file deletions, payment requests, or credential disclosures are indicators of IPI exploitation.
  • Consider allowlisting trusted content sources for agents performing high-privilege actions.

Sources: