arXiv — The Promptware Kill Chain: reframing prompt injection as multi-step malware
• Category: Research
- New framing: Nassi, Schneier, and Brodt (arXiv 2601.09625, Jan 14 2026) argue that the catch-all term "prompt injection" obscures a more complex reality — attacks against LLM-based systems increasingly resemble multi-step malware campaigns rather than isolated input manipulations.
- "Promptware" as a malware class: the paper coins the term promptware to describe adversarial payloads that execute in natural-language space rather than machine code but follow systematic sequences analogous to traditional malware kill chains.
- Five-stage kill chain model:
  - Initial Access: prompt injection (direct or indirect) delivers the payload via user input, poisoned documents, emails, websites, or RAG data.
  - Privilege Escalation: jailbreaking techniques bypass safety training and guardrails.
  - Persistence: memory and retrieval poisoning ensures the payload survives across sessions, corrupting the agent's long-term knowledge.
  - Lateral Movement: the attack propagates across users, devices, connected services, or peer agents in multi-agent architectures.
  - Actions on Objective: data exfiltration, unauthorized transactions, system compromise, or other attacker goals.
- Mapped to real attacks: the authors demonstrate the framework by mapping documented incidents to the five kill chain stages, including EchoLeak (CVE-2025-32711) against Microsoft 365 Copilot, RAG poisoning campaigns, and cross-agent propagation scenarios.
- Why existing defenses fail: traditional prompt injection defenses focus on input filtering (stage 1). The kill chain model shows that by the time an injection is detected, the payload may already have escalated privileges, persisted in memory, moved laterally, and acted on its objective, so each stage needs its own controls.
- Common vocabulary: the framework provides a shared terminology for AI safety and cybersecurity practitioners, bridging two communities that often describe the same attacks in incompatible language.
- Implication for autonomous agents: as agents gain tool access, persistent memory, and multi-agent communication, promptware campaigns become increasingly viable — the attack surface at each kill chain stage grows with agent capability.
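
As a rough illustration of how the five stages could serve as a threat-modeling checklist, the Python sketch below models them as an enum and reports which stages have no control assigned. The names `Stage`, `uncovered_stages`, and the sample `deployment` mapping are assumptions made for this example and do not come from the paper.

```python
from enum import Enum


class Stage(Enum):
    """The five kill chain stages, in the order listed above."""
    INITIAL_ACCESS = "Initial Access"
    PRIVILEGE_ESCALATION = "Privilege Escalation"
    PERSISTENCE = "Persistence"
    LATERAL_MOVEMENT = "Lateral Movement"
    ACTIONS_ON_OBJECTIVE = "Actions on Objective"


def uncovered_stages(controls: dict[Stage, list[str]]) -> list[Stage]:
    """Return the stages, in kill chain order, that have no control assigned."""
    return [stage for stage in Stage if not controls.get(stage)]


# Hypothetical deployment that relies on input filtering alone.
deployment = {
    Stage.INITIAL_ACCESS: ["input filter"],
}

gaps = uncovered_stages(deployment)
print([stage.value for stage in gaps])
# ['Privilege Escalation', 'Persistence', 'Lateral Movement', 'Actions on Objective']
```

A deployment that only filters input leaves four of the five stages uncovered, which is the gap the kill chain framing is meant to expose.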
Why it matters
- The kill chain model transforms prompt injection from a single-point vulnerability into a structured threat-modeling methodology, giving security teams a systematic way to identify and address attack progression at each stage.
- By drawing explicit parallels to traditional malware analysis (for example, the Lockheed Martin Cyber Kill Chain), the paper makes LLM security legible to conventional security teams, which is critical as AI agent deployments move into enterprise production.
- The persistence and lateral movement stages are particularly underappreciated: most organizations focus on input filtering and miss that a successful injection can self-replicate through memory systems and agent-to-agent communication.
What to do
- Adopt kill-chain thinking: map your LLM/agent deployments against all five stages rather than stopping at input validation. Ask: if injection succeeds, can the attacker escalate privileges, persist, move laterally, and act on an objective?
- Implement defense-in-depth with a control at each stage: input validation, jailbreak detection, memory integrity checks, agent isolation, and action authorization with human-in-the-loop approval for high-impact operations.
- Audit memory and RAG pipelines: treat persistent memory and vector stores as potential persistence mechanisms. Implement provenance tracking and periodic integrity verification.
- Restrict lateral movement: in multi-agent architectures, enforce trust boundaries between agents — validate inter-agent messages the same way you'd validate external input.
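
One possible way to approach the provenance tracking and integrity verification mentioned above is to store a source label and a keyed hash with every memory entry, then check both before the entry is fed back to the model. The sketch below uses Python's standard `hmac` module. `MemoryRecord`, `write_record`, `verify_record`, and the source labels are hypothetical names chosen for illustration, and key management is left out entirely.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class MemoryRecord:
    text: str    # content stored in agent memory or a vector store
    source: str  # provenance label, e.g. "user:alice" (illustrative format)
    tag: str     # HMAC-SHA256 over text and source, computed at write time


def _mac(key: bytes, text: str, source: str) -> str:
    """Compute a keyed hash over the record's content and provenance."""
    payload = json.dumps({"text": text, "source": source}, sort_keys=True)
    return hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()


def write_record(key: bytes, text: str, source: str) -> MemoryRecord:
    """Create a record whose tag binds the text to its source."""
    return MemoryRecord(text=text, source=source, tag=_mac(key, text, source))


def verify_record(key: bytes, record: MemoryRecord, trusted_sources: set[str]) -> bool:
    """Accept a record only if it is unmodified and comes from a trusted source."""
    expected = _mac(key, record.text, record.source)
    untampered = hmac.compare_digest(expected, record.tag)
    return untampered and record.source in trusted_sources


# Usage with a placeholder key; a real deployment would load it from a secret store.
key = b"example-key-for-illustration-only"
record = write_record(key, "User prefers weekly summaries.", "user:alice")
print(verify_record(key, record, {"user:alice"}))
```

A check like this only detects entries that were altered after being written or that carry an untrusted source label. It does not detect poisoned content that arrives through a legitimate write path, so it complements the input-stage controls rather than replacing them.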