arXiv — LLM-agent threat model and attack taxonomy survey

2026-02-26 Research by al-ice.ai Editorial

AI relevance: The paper maps real-world LLM agent failures (prompt injection, tool abuse, protocol exploits) into a unified threat model that applies directly to AI agent deployments.

The survey proposes an end-to-end threat model for LLM-agent ecosystems, covering host-to-tool and agent-to-agent communications.
It catalogs 30+ attack techniques spanning input manipulation, model compromise, system/privacy abuse, and protocol-level vulnerabilities.
The taxonomy explicitly connects prompt-to-SQL injections with higher-layer agent workflows rather than treating them as isolated app bugs.
Protocol-level examples include tool-schema confusion and cross-agent message tampering in multi-agent workflows.
The authors highlight weak validation and ad hoc authentication across plugins and connectors as a systemic amplifier of risk.
Mitigations emphasize dynamic trust management, provenance tracking, and sandboxing of tool interfaces.
The paper maps incidents to CVE/NVD records to ground the taxonomy in real, published vulnerabilities.
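
To make the cross-agent message tampering risk above concrete, here is a minimal Python sketch of one common countermeasure: attaching an HMAC tag to each inter-agent message so the receiver can detect modification in transit. It is not taken from the paper. The envelope layout, the function names sign_message and verify_message, and the hard-coded SHARED_KEY are all assumptions made for illustration.

```python
import hashlib
import hmac
import json

# Hypothetical shared key for the demo; a real deployment would load it
# from a secret store and rotate it.
SHARED_KEY = b"demo-key-not-for-production"


def sign_message(sender: str, payload: dict, key: bytes = SHARED_KEY) -> dict:
    """Wrap a payload in an envelope that carries an HMAC-SHA256 tag."""
    # sort_keys gives a stable serialization, so the same content always
    # produces the same bytes to sign.
    body = json.dumps({"sender": sender, "payload": payload}, sort_keys=True)
    tag = hmac.new(key, body.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"body": body, "tag": tag}


def verify_message(envelope: dict, key: bytes = SHARED_KEY) -> dict:
    """Return the decoded body, or raise ValueError if the tag does not match."""
    expected = hmac.new(
        key, envelope["body"].encode("utf-8"), hashlib.sha256
    ).hexdigest()
    # compare_digest avoids leaking the mismatch position through timing.
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise ValueError("message failed integrity check")
    return json.loads(envelope["body"])
```

A receiving agent would call verify_message before acting on any instruction and treat a ValueError as a reason to drop the message. A shared key only shows that the message came from some key holder; per-agent keys or asymmetric signatures would be needed to tell individual agents apart.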

Why it matters

Agent security failures increasingly blend classic injection bugs with protocol exploits, making narrow app-only reviews insufficient.
Unified taxonomies help teams prioritize controls across prompts, tools, and inter-agent protocols instead of patching one layer at a time.
By linking to public CVEs, the survey provides a defensible checklist for audits and risk sign-off.

What to do

Threat model by layer: separate input, tool, and protocol risks in your agent design reviews.
Harden tool boundaries: enforce schema validation and least-privilege scopes on tool calls.
Track provenance: log tool responses and agent decisions to support forensic triage.
Map to CVEs: use the paper’s CVE/NVD mapping as a starting point for patch prioritization.
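
The tool-boundary and provenance steps above can be sketched in a few lines of Python. This is an illustrative gate under stated assumptions, not an implementation from the paper: ToolSpec, ToolGate, the scope string format, and the audit record fields are all hypothetical names chosen for the example.

```python
import json
import time
from dataclasses import dataclass, field


@dataclass
class ToolSpec:
    """Hypothetical tool description: required scope plus expected argument types."""
    name: str
    required_scope: str
    arg_types: dict


@dataclass
class ToolGate:
    """Checks each tool call against its spec and records the decision."""
    specs: dict
    audit_log: list = field(default_factory=list)

    def check(self, agent_id: str, scopes: set, tool: str, args: dict) -> bool:
        """Return True if the call is allowed; always append an audit record."""
        reason = self._reject_reason(scopes, tool, args)
        record = {
            "ts": time.time(),
            "agent": agent_id,
            "tool": tool,
            "args": args,
            "allowed": reason is None,
            "reason": reason,
        }
        # default=str keeps logging from failing on non-JSON argument values.
        self.audit_log.append(json.dumps(record, default=str))
        return reason is None

    def _reject_reason(self, scopes: set, tool: str, args: dict):
        spec = self.specs.get(tool)
        if spec is None:
            return "unknown tool"
        # Least privilege: the caller must hold the exact scope the tool needs.
        if spec.required_scope not in scopes:
            return "missing scope"
        # Schema check: no extra or missing arguments, and each has the right type.
        if set(args) != set(spec.arg_types):
            return "unexpected or missing arguments"
        for name, expected in spec.arg_types.items():
            if not isinstance(args[name], expected):
                return f"bad type for {name}"
        return None
```

A gate built as ToolGate({"read_file": ToolSpec("read_file", "fs:read", {"path": str})}) would allow a read_file call from an agent holding the fs:read scope and reject one that adds an unexpected argument or lacks the scope. Rejected calls are logged alongside allowed ones, so the audit log doubles as a record for forensic triage.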
