Microsoft — Updated Agentic AI Failure Modes Taxonomy (7 New)

AI relevance: Microsoft's expanded taxonomy gives defenders a concrete, vendor-backed classification of agentic AI attack surfaces — directly applicable to red-team planning, SBOM generation for agent deployments, and MCP server risk assessments.

  • Microsoft published an update to its Taxonomy of Failure Modes in Agentic AI Systems, adding seven new categories based on a year of red-teaming and real-world findings.
  • Agentic Supply Chain Compromise — agent behavior can be manipulated through natural language instructions rather than code-level tampering, expanding traditional supply-chain thinking to prompt-space.
  • Goal Hijacking — adversarial instructions that appear aligned with the original task while silently redirecting the agent's terminal objective.
  • Inter-Agent Trust Escalation — a compromised agent falsely asserts identity or inflated permissions to an orchestrator, enabling lateral compromise across multi-agent workflows.
  • Computer Use Agent (CUA) Visual Attack — agents interacting through graphical interfaces can be manipulated via adversarial visual content, a risk unique to screen-controlling agents.
  • Session Context Contamination — data introduced in one step biases the agent's reasoning in subsequent steps without triggering per-step safety controls, a multi-turn stealth vector.
  • MCP / Plugin Abuse — protocol-specific attack surfaces around Model Context Protocol and plugin ecosystems, including tool poisoning and unauthorized tool invocation.
  • Capability / Architecture Disclosure — agents revealing internal implementation details (tool schemas, system-prompt structure, memory interfaces, or human-in-the-loop trigger logic) to adversarial prompts.
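
As a rough illustration of the MCP / Plugin Abuse category above, the Python sketch below shows one possible guard against tool poisoning and unauthorized tool invocation: only allowlisted tools may be called, and each tool's description is pinned by hash so that a later change is flagged for review. This is an assumption-laden sketch, not part of Microsoft's taxonomy or of any MCP SDK; `APPROVED_TOOLS`, `fingerprint`, and `check_tool_call` are hypothetical names.

```python
import hashlib


def fingerprint(description: str) -> str:
    """Return the SHA-256 hex digest of a tool description."""
    return hashlib.sha256(description.encode("utf-8")).hexdigest()


# Hypothetical allowlist: tool name -> hash of the description that was
# reviewed when the tool was approved. A real deployment would load this
# from reviewed configuration rather than hard-code it.
APPROVED_TOOLS = {
    "read_file": fingerprint("Read a file from the workspace."),
}


def check_tool_call(name: str, current_description: str) -> str:
    """Return 'allow', 'unknown_tool', or 'description_changed'."""
    pinned = APPROVED_TOOLS.get(name)
    if pinned is None:
        # The tool was never approved: treat as unauthorized invocation.
        return "unknown_tool"
    if fingerprint(current_description) != pinned:
        # The description differs from the reviewed one: possible poisoning.
        return "description_changed"
    return "allow"
```

Under these assumptions, `check_tool_call("delete_repo", "Delete a repository.")` returns `"unknown_tool"`, and a `read_file` description that has been edited to carry extra instructions returns `"description_changed"`. Hashing detects that a description changed after review; it does not judge whether the reviewed text was safe in the first place.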

Why it matters

Because it draws on a year of red-teaming and real-world findings from a major vendor, this is one of the more authoritative updates to an agentic AI threat taxonomy to date. The seven new categories map directly to attack patterns observed across Claude Code, GitHub Copilot, and MCP-based agent fleets. Teams building or deploying agents should treat them as a minimum red-team checklist.

What to do

  • Generate an SBOM for every deployed agent — including prompt templates, MCP server dependencies, and tool integrations.
  • Verify agent identity cryptographically rather than inferring it from the agent's position in a workflow. Issue attestable credentials at provisioning.
  • Add all seven new failure modes to your red-team coverage matrix.
  • Audit human-in-the-loop UX as a security control — ensure approval gates can't be bypassed by UI spoofing or CUA visual attacks.
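
The first three items above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a standard SBOM format or a production attestation scheme: the manifest layout and all function names are hypothetical, and HMAC with a shared key stands in for whatever credential mechanism (for example asymmetric signatures or hardware-backed attestation) a real deployment would use.

```python
import hashlib
import hmac
import json

# The seven new categories, as named in the list earlier in this brief.
FAILURE_MODES = [
    "Agentic Supply Chain Compromise",
    "Goal Hijacking",
    "Inter-Agent Trust Escalation",
    "Computer Use Agent (CUA) Visual Attack",
    "Session Context Contamination",
    "MCP / Plugin Abuse",
    "Capability / Architecture Disclosure",
]


def build_agent_manifest(name, prompt_templates, mcp_servers, tools):
    """Build an SBOM-like inventory (hypothetical layout) for one agent."""
    return {
        "agent": name,
        # Hash each prompt template so later edits to it are detectable.
        "prompt_templates": {
            template_id: hashlib.sha256(text.encode("utf-8")).hexdigest()
            for template_id, text in sorted(prompt_templates.items())
        },
        "mcp_servers": sorted(mcp_servers),
        "tools": sorted(tools),
    }


def _sign(key, claims):
    """Compute an HMAC-SHA256 tag over a canonical JSON encoding of claims."""
    payload = json.dumps(claims, sort_keys=True).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def issue_credential(key, agent_id, permissions):
    """Issue a signed credential at provisioning time (HMAC as a stand-in)."""
    claims = {"agent_id": agent_id, "permissions": sorted(permissions)}
    return {"claims": claims, "tag": _sign(key, claims)}


def verify_credential(key, credential):
    """Return True only if the claims still match the tag issued for them."""
    expected = _sign(key, credential["claims"])
    return hmac.compare_digest(expected, credential["tag"])


def uncovered_modes(tests_by_mode):
    """List the failure modes that have no red-team test recorded."""
    return [mode for mode in FAILURE_MODES if not tests_by_mode.get(mode)]
```

With this sketch, an orchestrator holding the key can reject an agent whose claimed permissions were inflated after issuance, because `verify_credential` fails once the claims no longer match the tag. `uncovered_modes` gives a quick gap report against the seven categories, and the manifest can be stored alongside each deployment so that changes to prompt templates or MCP dependencies show up as a diff.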

Sources