Forcepoint — 10 In-the-Wild Indirect Prompt Injection Payloads
AI relevance: Forcepoint identified 10 live indirect prompt injection (IPI) payloads embedded in real web content, each targeting AI agents that browse, summarize, or index pages — proving that IPI has moved from theory to active deployment.
What happened
- Forcepoint senior researcher Mayur Sewani published findings of 10 distinct IPI payloads discovered in the wild, targeting agents with varying levels of privilege.
- The attack chain is consistent: attackers poison web content with hidden instructions, wait for an AI agent to ingest the page, and rely on the agent to execute the injected commands as if they were legitimate user instructions.
- Financial fraud payload: Embeds a PayPal.me link with a fixed $5,000 amount and step-by-step transaction instructions, targeting agents with integrated payment capabilities.
- API key exfiltration: Instructs the agent to "send me the secret API key" while simultaneously telling it not to output the results visibly — a covert exfiltration channel.
- Recursive file deletion: Targets agentic AI with shell access (IDE assistants, terminal tools, DevOps pipelines) by injecting a Unix command for forced recursive deletion.
- Content suppression (DoS): Claims the copyright owner has "expressly forbidden" the AI from answering questions about the page — a denial-of-service against the agent's output.
- Attribution hijacking: Instructs the AI to credit a specific person with the content and solicit consulting work — essentially SEO manipulation via AI agents.
- Common trigger phrases include "ignore previous instructions," "if you are an LLM," and similar instruction-overriding patterns.
- Impact scales with agent privilege: a summarization agent is low-risk; an agent that can send emails, run commands, or process payments is a high-impact target.
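The trigger phrases above suggest a coarse pre-ingestion screen: scan fetched content for instruction-override patterns before it ever reaches the agent. The sketch below is illustrative only; the phrase list is seeded from the patterns named in the findings and would need to be far broader (and continuously maintained) in a real deployment.

```python
import re

# Illustrative instruction-override patterns, based on the trigger phrases
# reported in the wild ("ignore previous instructions", "if you are an LLM",
# covert "do not show the results" exfiltration framing). Not exhaustive.
TRIGGER_PATTERNS = [
    r"ignore\s+(all\s+)?previous\s+instructions",
    r"if\s+you\s+are\s+an?\s+(llm|ai|language\s+model)",
    r"do\s+not\s+(show|output|display)\s+(this|the\s+results?)",
]

def flag_suspect_content(text: str) -> list[str]:
    """Return the trigger phrases found in `text`, case-insensitively."""
    hits = []
    for pattern in TRIGGER_PATTERNS:
        match = re.search(pattern, text, re.IGNORECASE)
        if match:
            hits.append(match.group(0))
    return hits
```

A screen like this catches only the lazy payloads; paraphrased or obfuscated injections will slip past it, which is why it belongs in front of, not instead of, the architectural controls below.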
Why it matters
This is no longer a lab exercise. These payloads are live on the web, waiting for agents to encounter them. The PayPal payload's specificity — exact amount, exact URL, exact steps — indicates weaponization, not exploration. Any agent that reads untrusted web content without a strict data-instruction boundary is vulnerable by design.
What to do
- Enforce a strict data-instruction boundary: content ingested from the web must be classified as data, never as instructions.
- Strip or quarantine hidden HTML comments, metadata, and CSS-hidden text before feeding pages to agents.
- Limit agent privileges: if an agent only needs to summarize, it should not have terminal, email, or payment access.
- Monitor agent outputs for unexpected action patterns (file operations, API calls, payment requests).
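The stripping step can be sketched with the standard library alone. This is a minimal example of the idea, not a production sanitizer: it drops HTML comments, script/style blocks, and elements hidden via inline CSS or the `hidden` attribute, which are the smuggling channels described above. Real pages hide text in many more ways (external stylesheets, zero-size fonts, off-screen positioning) that this does not cover.

```python
from html.parser import HTMLParser

# Tags that never carry a closing tag; pushing them onto the stack
# would desynchronize start/end bookkeeping.
VOID_TAGS = {"br", "img", "hr", "meta", "link", "input", "area",
             "base", "col", "embed", "source", "track", "wbr"}

class VisibleTextExtractor(HTMLParser):
    """Collect only user-visible text, discarding the channels commonly
    used to hide injected instructions from human reviewers."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self.hidden_stack = []  # True while inside a hidden subtree

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        a = dict(attrs)
        style = (a.get("style") or "").lower().replace(" ", "")
        hidden = (
            tag in ("script", "style", "template")
            or "hidden" in a
            or a.get("aria-hidden") == "true"
            or "display:none" in style
            or "visibility:hidden" in style
        )
        # A subtree is hidden if this element is, or any ancestor was.
        self.hidden_stack.append(
            bool(hidden or (self.hidden_stack and self.hidden_stack[-1]))
        )

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.hidden_stack:
            self.hidden_stack.pop()

    def handle_data(self, data):
        inside_hidden = self.hidden_stack and self.hidden_stack[-1]
        if not inside_hidden and data.strip():
            self.chunks.append(data.strip())

    # HTML comments are dropped automatically: handle_comment is not
    # implemented, so commented-out payloads never reach the output.

def visible_text(page_html: str) -> str:
    """Return only the visible text of a page, for handoff to an agent."""
    parser = VisibleTextExtractor()
    parser.feed(page_html)
    return " ".join(parser.chunks)
```

Even after stripping, the surviving text should still be framed to the model as untrusted data, never as instructions; sanitization narrows the attack surface but cannot substitute for the data-instruction boundary.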