OWASP — Agent Memory Guard, a Runtime Defense Against Memory Poisoning

AI relevance: OWASP's new Agent Memory Guard middleware directly protects production LLM agents against the ASI06 memory poisoning threat — where malicious instructions written to long-term semantic memory fire weeks later as "trusted" historical context.

The Problem: Memory Poisoning Beats Prompt-Injection Defenses

Prompt injection is stateless: when the session ends, the attack ends. Memory poisoning changes the game. An attacker plants a malicious instruction through any external data source the agent processes — a compromised PDF, a poisoned knowledge-base article, a manipulated inbound email. That instruction gets written to the agent's long-term retrieval store, where it blends into the agent's learned identity. Days or weeks later, at retrieval time, the poisoned memory triggers data exfiltration, unauthorized tool calls, or behavioral drift — with the credibility of memory itself.

Input sanitization happens upstream of the memory write. Output validation happens downstream of retrieval. Memory poisoning bypasses both by writing past the prompt layer entirely.

Agent Memory Guard: Four-Layer Defense

OWASP's Agent Memory Guard ships as a drop-in integration for LangChain, LlamaIndex, and CrewAI. It wraps the host framework's memory read/write API so application code doesn't change — but every access passes through four defense layers:

  • Cryptographic baselines: SHA-256 hashes of memory blobs at rest, continuously validated to detect tampering between writes.
  • Real-time anomaly detection: Monitors for rapid state changes, unauthorized modifications of protected operational keys, and unusual size expansions in JSON/YAML memory blobs — classic injection payload signatures.
  • Composite trust scoring with temporal decay: Older and unverified entries receive lower weight at retrieval, so uncertain memory doesn't dominate current decisions. This approach is formalized in arXiv 2601.05504.
  • Forensic state snapshots: Automatic capture of pre-poisoning state, enabling rollback to a known-good cognitive state the moment infection is detected.
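
To make the first two layers concrete, here is a minimal Python sketch of a hash baseline plus two simple write-time anomaly checks. It is not Agent Memory Guard's actual API: the BaselineStore class, the protected key names, and the 4x growth threshold are all assumptions chosen for illustration.

```python
import copy
import hashlib
import json


def fingerprint(entry: dict) -> str:
    """SHA-256 over a canonical JSON encoding of a memory entry."""
    canonical = json.dumps(entry, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class BaselineStore:
    """Hypothetical in-memory store that records a hash at write time."""

    PROTECTED_KEYS = {"system_policy", "allowed_tools"}  # assumed key names
    MAX_GROWTH = 4.0  # assumed threshold: flag entries that grow more than 4x

    def __init__(self):
        self._entries = {}
        self._baselines = {}

    def write(self, key: str, entry: dict) -> list:
        """Return a list of findings; an empty list means the write was accepted."""
        findings = []
        old = self._entries.get(key)
        if old is not None:
            # Overwriting an existing protected operational key is suspicious.
            if key in self.PROTECTED_KEYS:
                findings.append("protected key modified")
            # A sudden jump in serialized size can indicate an injected payload.
            if len(json.dumps(entry)) > self.MAX_GROWTH * len(json.dumps(old)):
                findings.append("unusual size expansion")
        if findings:
            return findings  # reject the write, keep the old entry and baseline
        self._entries[key] = copy.deepcopy(entry)
        self._baselines[key] = fingerprint(entry)
        return []

    def verify(self, key: str) -> bool:
        """True if the stored entry still matches the hash taken at write time."""
        return fingerprint(self._entries[key]) == self._baselines[key]
```

Calling verify() on a schedule or before each retrieval would surface out-of-band edits to the store. As the article notes, it would not catch a malicious entry that arrives through the normal write path, which is why the anomaly checks sit on write().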

Why It Matters

  • OWASP classified ASI06 (Memory Poisoning) in its Top 10 for Agentic Applications in early 2026 — this is the first production-grade defense substrate targeting that threat class at the middleware layer.
  • No single layer is sufficient on its own. SHA-256 catches tampering between snapshots but not adversarial writes through legitimate channels. Anomaly detection catches injection signatures but misses subtle poisoning that mimics normal patterns. Trust scoring catches old unverified entries but lets recent, cleverly authenticated injections through. The combination covers the attack surface.
  • In multi-agent environments using shared memory orchestration, a single poisoned peer can infect the entire network via routine message passing. Researchers documented this behavior as "explicitly viral" and shaped like a network worm.
  • The transition from academic research (arXiv 2601.05504) to production middleware took roughly four months — fast even by AI-security standards.

What to Do

  • If you run production agents with persistent memory (LangChain, LlamaIndex, CrewAI), evaluate Agent Memory Guard as a drop-in defense layer.
  • Audit your agent's memory read/write boundaries — ensure no external data source can write directly to long-term storage without a validation gate.
  • Implement temporal decay for memory entries: older, unverified knowledge should carry less weight in retrieval decisions.
  • Set up forensic snapshots so you can roll back agent memory to a known-good state if poisoning is detected.
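
The last three recommendations can be sketched in a few lines of Python. This is a rough illustration under stated assumptions, not the scoring formula from arXiv 2601.05504 or any framework's real API: the exponential half-life decay, the 30-day default, and the names trust_score, SnapshotMemory, and gated_write are all hypothetical.

```python
import copy


def trust_score(base_trust: float, age_seconds: float, verified: bool,
                half_life_days: float = 30.0) -> float:
    """Retrieval weight for a memory entry.

    Assumption: unverified entries lose half their weight every
    half_life_days; verified entries keep their base trust.
    """
    if verified:
        return base_trust
    half_life_seconds = half_life_days * 86400
    return base_trust * 0.5 ** (age_seconds / half_life_seconds)


class SnapshotMemory:
    """Hypothetical memory store with labeled snapshots for rollback."""

    def __init__(self):
        self.state = {}
        self._snapshots = []

    def snapshot(self, label: str) -> None:
        # Deep copy so later writes cannot alter the saved state.
        self._snapshots.append((label, copy.deepcopy(self.state)))

    def rollback(self, label: str) -> None:
        # Restore the most recent snapshot saved under this label.
        for name, saved in reversed(self._snapshots):
            if name == label:
                self.state = copy.deepcopy(saved)
                return
        raise KeyError(label)


def gated_write(memory: SnapshotMemory, key: str, value, validate) -> bool:
    """Validation gate: write only if the caller-supplied check passes."""
    if not validate(value):
        return False
    memory.snapshot(f"before:{key}")  # capture state prior to the write
    memory.state[key] = value
    return True
```

With these defaults, an unverified entry with base trust 0.8 would be weighted at about 0.4 after 30 days. The validate callable is where a real deployment would plug in its own content checks; the sketch leaves that policy to the caller.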

Sources