MemMorph — Tool Hijacking in LLM Agents via Memory Poisoning

AI relevance: As LLM agents increasingly adopt long-term memory modules for tool selection decisions, MemMorph exposes memory itself as a stealthy attack surface — attackers can hijack tool routing without touching tool metadata.

  • arXiv:2605.26154 from researchers at Nanyang Technological University, Singapore (May 24, 2026).
  • MemMorph is the first attack to bias tool selection by poisoning the agent's long-term memory rather than manipulating tool descriptions or metadata.
  • The attack injects a small number of crafted records — disguised as technical facts, incident reports, or operational policies — into the agent's memory store.
  • These poisoned records reshape the agent's contextual perception, leading it to autonomously infer and select the attacker's preferred tool.
  • Evaluated across 3 benchmarks, 10 agent backbones, and 3 memory-module implementations.
  • MemMorph achieves up to 85.9% attack success rate with only three injected records.
  • Outperforms the strongest baseline by up to 25% and retains potency under 3 representative defenses.
  • Unlike tool-metadata manipulation attacks, memory poisoning is harder to detect through tool auditing and persists across sessions.

Why it matters

Agent memory is becoming a standard component of production LLM systems (vector stores, conversation history, learned preferences). MemMorph demonstrates that this memory layer is not just a performance optimization — it's an attack surface. A compromise that persists across sessions and requires no modification to tool definitions is significantly harder to detect and remediate than direct tool-description injection.

What to do

  • Treat agent memory stores as security-critical infrastructure, not just caching.
  • Implement integrity verification or provenance tracking for memory records, especially those used in tool-selection contexts.
  • Audit memory stores for anomalous entries that could serve as poisoned training signals.
  • Consider memory-level defenses alongside prompt-level defenses — the two attack surfaces compound.

Sources: