arXiv Survey: Agentic AI in IT Ops Faces the Classic Confused-Deputy Problem

AI relevance: Agentic AI systems used for IT operations and network management inherit the classic confused-deputy problem: they hold legitimate write access to production infrastructure while consuming untrusted text from tickets, runbooks, and logs.

Key findings

  • Confused-deputy substrate. LLM operations agents are granted direct access to change-management APIs, deployment pipelines, and network controllers. Their decisions are shaped by the same artifacts an attacker can influence — incident tickets, wiki pages, and chat transcripts. Compromising the tool is unnecessary when the text the agent reads before acting can be poisoned.
  • Four attack categories cataloged. The survey identifies prompt injection through operational artifacts (malicious instructions in tickets), retrieval poisoning (corrupting runbooks to bias diagnoses), retrieval jamming (flooding knowledge bases with blocker documents that trigger refusal loops and stall incident response), and telemetry manipulation (influencing metrics/logs to steer mitigation decisions without touching the model).
  • These attacks look like normal incidents going wrong. Unlike traditional exploits, adversarial influence on an operations agent produces behavior indistinguishable from a legitimate agent making a bad call, which makes detection and attribution significantly harder.
  • Propose-commit split as architectural defense. The survey's core recommendation is that the language model should reason, retrieve evidence, and draft change proposals, but must never execute writes. A non-bypassable gate outside the model's authority performs policy-as-code checks, invariant verification, human approval for high-blast-radius changes, and rollback-ready staged deployment.
  • Prompt-only defenses are brittle. Any system where the model's text generation can directly cause production changes has built its security perimeter inside the most unpredictable component in the stack. The Excessive Agency risk in the OWASP Top 10 for LLM Applications is, in practice, a failure to implement the propose-commit split cleanly.
  • Missing evaluation evidence. The survey identifies what adversarial evaluations should report: tool-call traces, gate-violation rates, behavior under adversarial inputs, refusal-storm rates under jamming attacks, and rollback completeness. Most current benchmarks omit these entirely.
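
The propose-commit split described above can be sketched in a few lines. This is a minimal illustration and not the survey's implementation: the names ChangeProposal, CommitGate, the two policy checks, the blast_radius field, and the threshold of 10 are all hypothetical. The point it tries to show is structural: the model-facing side can only construct a proposal object, while the only code path that reaches the write function lives in the gate.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class ChangeProposal:
    """What the model may produce: a draft, never a write. (Hypothetical schema.)"""
    target: str            # service or device the change would touch
    diff: str              # proposed change, as text
    blast_radius: int      # estimated number of affected hosts (assumed field)
    evidence: tuple = ()   # IDs of tickets/runbooks the model cited


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


# A policy check returns a rejection reason, or None when the proposal passes.
PolicyCheck = Callable[[ChangeProposal], Optional[str]]


def no_wildcard_targets(p: ChangeProposal) -> Optional[str]:
    return "wildcard target" if "*" in p.target else None


def must_cite_evidence(p: ChangeProposal) -> Optional[str]:
    return "no evidence cited" if not p.evidence else None


class CommitGate:
    """Holds the write capability; the model never receives apply_fn."""
    HIGH_BLAST_RADIUS = 10  # illustrative threshold, not from the survey

    def __init__(self, checks: Iterable[PolicyCheck],
                 apply_fn: Callable[[ChangeProposal], None]):
        self._checks = list(checks)
        self._apply = apply_fn

    def submit(self, proposal: ChangeProposal,
               approved_by: Optional[str] = None) -> Decision:
        # 1. Policy-as-code checks run on every proposal.
        for check in self._checks:
            reason = check(proposal)
            if reason is not None:
                return Decision(False, f"policy: {reason}")
        # 2. High-blast-radius changes need a named human approver.
        if proposal.blast_radius >= self.HIGH_BLAST_RADIUS and not approved_by:
            return Decision(False, "human approval required")
        # 3. Only now does the write happen, outside the model's authority.
        self._apply(proposal)
        return Decision(True, "applied")


applied = []  # stands in for the real change-management API
gate = CommitGate([no_wildcard_targets, must_cite_evidence], applied.append)

ok = gate.submit(ChangeProposal("edge-router-7", "set mtu 9000", 1, ("TICKET-1",)))
blocked = gate.submit(ChangeProposal("core-*", "shutdown", 1, ("TICKET-2",)))
needs_human = gate.submit(ChangeProposal("core-fabric", "reload", 50, ("TICKET-3",)))
print(ok.reason, "|", blocked.reason, "|", needs_human.reason)
# prints: applied | policy: wildcard target | human approval required
```

In a real deployment the separation would have to be enforced by credentials and process boundaries rather than by Python object scope, and the staged rollout and rollback steps the survey mentions are omitted here for brevity.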

Why it matters

Organizations are deploying agentic remediation and self-healing infrastructure at pace. Without a propose-commit split, every agent is a confused deputy waiting to be tricked by a poisoned ticket or tampered runbook. The survey's taxonomy gives security teams a concrete framework to evaluate whether a vendor's "autonomous remediation" is architecturally sound or a liability.

What to do

  • Enforce a propose-commit architecture: agents draft diffs; separate gates apply them.
  • Require integrity-protected audit logs for post-incident forensics.
  • Request adversarial evaluation data (not just clean-workload benchmarks) before procuring agentic operations tools.
  • Scope agent privileges to read-only assistance or bounded execution with strong gates, and avoid open-ended self-healing across large production environments.
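
One common way to make an audit log integrity-protected is a hash chain, where each entry commits to the one before it. The sketch below is an assumption about how that recommendation could be met, not something the survey prescribes; the function names and the record fields are hypothetical. It uses only Python's standard hashlib and json modules.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON so the same record always hashes to the same value.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log: list, record: dict) -> dict:
    """Append a record, chaining it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {"record": record, "prev": prev_hash,
             "hash": _digest(prev_hash, record)}
    log.append(entry)
    return entry


def verify_chain(log: list):
    """Return the index of the first inconsistent entry, or None if intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(log):
        expected = _digest(prev_hash, entry["record"])
        if entry["prev"] != prev_hash or entry["hash"] != expected:
            return i
        prev_hash = entry["hash"]
    return None


log = []
append_entry(log, {"actor": "agent", "action": "propose", "target": "edge-router-7"})
append_entry(log, {"actor": "gate", "action": "reject", "reason": "policy"})
print(verify_chain(log))  # prints: None

log[0]["record"]["action"] = "apply"  # simulate after-the-fact tampering
print(verify_chain(log))  # prints: 0
```

A chain like this only detects edits that leave later hashes untouched. Someone who can rewrite the whole log can recompute every hash, so the latest hash would also need to be stored somewhere the agent and its host cannot write, such as a separate append-only store.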

Sources