arXiv — Silent Egress: implicit prompt injection makes LLM agents leak without a trace

2026-02-27 Research by al-ice.ai Editorial

AI relevance: The paper demonstrates how URL-preview prompt injection can drive tool-using LLM agents to exfiltrate runtime context via outbound requests, even when the user-facing response looks safe.

Introduces “silent egress”: implicit prompt injection embedded in URL previews (titles, metadata, snippets) that steers agent behavior.
Shows a malicious web page can induce agents to issue outbound requests that exfiltrate sensitive context while returning a harmless answer to the user.
Experiments use a fully local testbed with a qwen2.5:7b-based agent across 480 runs.
Reported success probability for egress is P=0.89; 95% of successful attacks evade output-based safety checks.
Introduces sharded exfiltration, which splits the leak across multiple requests to bypass simple DLP heuristics; the authors report a 73% reduction in Leak@1.
Finds prompt-layer defenses are limited, while system/network controls (allowlists, redirect-chain analysis) are more effective.
Recommends treating network egress as a first-class security outcome for agentic systems.
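
To illustrate why sharding can defeat a simple DLP heuristic, here is a minimal sketch. It is not the paper's testbed or detector: the secret format, the function names, and the session-level check are all assumptions made for illustration. A per-request check that looks for a complete secret sees nothing once the secret is split into pieces, while a check over the concatenated payloads of a session still matches in this simple case.

```python
import re

# Hypothetical secret format ("sk-" plus 16 alphanumerics), for illustration only.
SECRET_PATTERN = re.compile(r"sk-[A-Za-z0-9]{16}")

def per_request_dlp(payload: str) -> bool:
    """Naive heuristic: flag one outbound payload if it contains a full secret."""
    return SECRET_PATTERN.search(payload) is not None

def shard(secret: str, size: int) -> list[str]:
    """Split a string into fixed-size pieces, as a sharding attacker might."""
    return [secret[i:i + size] for i in range(0, len(secret), size)]

def session_dlp(payloads: list[str]) -> bool:
    """Assumed countermeasure: scan the in-order concatenation of a session's payloads."""
    return per_request_dlp("".join(payloads))

secret = "sk-" + "A1b2C3d4E5f6G7h8"   # made-up value
shards = shard(secret, 5)              # four pieces, none a full secret

print(per_request_dlp(secret))                   # unsharded leak is flagged
print(any(per_request_dlp(s) for s in shards))   # each shard passes the naive check
print(session_dlp(shards))                       # aggregated view still matches
```

The session-level check is only a sketch: an attacker who encodes or reorders shards would evade it too, which is consistent with the paper's point that network-level controls matter more than content heuristics.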

Why it matters

Agents that fetch URLs or run tools can be coerced by untrusted metadata, not just page content.
Output filters alone won’t catch this class of attack; the leakage happens before the final response.
It reframes agent security from “prompt safety” to runtime egress control and provenance-aware data flows.

What to do

Enforce egress allowlists: restrict which domains and endpoints agents can contact, and validate redirect chains.
Isolate URL previewing: fetch and parse previews in a sandbox with minimal context and no secrets.
Log and monitor outbound requests: treat unexpected egress as a security signal.
Apply DLP to outbound tool traffic: inspect request payloads, not just model responses.
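
The allowlist and redirect-chain checks above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's implementation: the hostnames, the https-only rule, and the function names are hypothetical, and the redirect chain is assumed to have been collected already by whatever HTTP client the agent uses.

```python
from urllib.parse import urlsplit

# Hypothetical allowlist; a real deployment would load this from policy config.
ALLOWED_HOSTS = {"api.example.com", "docs.example.com"}

def is_allowed(url: str, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Return True only for https URLs whose host is exactly on the allowlist."""
    parts = urlsplit(url)
    # .hostname strips userinfo and port and is lowercased; None if absent.
    host = parts.hostname or ""
    return parts.scheme == "https" and host in allowed_hosts

def validate_redirect_chain(chain, allowed_hosts=ALLOWED_HOSTS) -> bool:
    """Every hop must pass the allowlist, not just the first or the last."""
    return bool(chain) and all(is_allowed(u, allowed_hosts) for u in chain)

chain = ["https://docs.example.com/a", "https://evil.test/collect?d=secret"]
if not validate_redirect_chain(chain):
    # Treat as a security signal: block the request and log the full chain.
    print("blocked egress:", chain)
```

Exact host matching is a deliberate choice here: suffix or substring matching would accept look-alike hosts such as api.example.com.evil.test.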
