Microsoft — Updated Agentic AI Failure Modes Taxonomy (7 New)
AI relevance: Microsoft's expanded taxonomy gives defenders a concrete, vendor-backed classification of agentic AI attack surfaces — directly applicable to red-team planning, SBOM generation for agent deployments, and MCP server risk assessments.
- Microsoft published an update to its Taxonomy of Failure Modes in Agentic AI Systems, adding seven new categories based on a year of red-teaming and real-world findings.
- Agentic Supply Chain Compromise — agent behavior can be manipulated through natural language instructions rather than code-level tampering, expanding traditional supply-chain thinking to prompt-space.
- Goal Hijacking — adversarial instructions that appear aligned with the original task while silently redirecting the agent's terminal objective.
- Inter-Agent Trust Escalation — a compromised agent falsely asserts identity or inflated permissions to an orchestrator, enabling lateral compromise across multi-agent workflows.
- Computer Use Agent (CUA) Visual Attack — agents interacting through graphical interfaces can be manipulated via adversarial visual content, a risk unique to screen-controlling agents.
- Session Context Contamination — data introduced in one step biases the agent's reasoning in subsequent steps without triggering per-step safety controls, a multi-turn stealth vector.
- MCP / Plugin Abuse — protocol-specific attack surfaces around Model Context Protocol and plugin ecosystems, including tool poisoning and unauthorized tool invocation.
- Capability / Architecture Disclosure — agents revealing internal implementation details (tool schemas, system-prompt structure, memory interfaces, or human-in-the-loop trigger logic) to adversarial prompts.
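One way to make the tool-poisoning part of MCP / Plugin Abuse concrete is to pin each reviewed tool definition and flag anything that later changes. The sketch below is illustrative only and is not taken from Microsoft's taxonomy. The helper names (`fingerprint`, `pin_tools`, `check_tools`) are hypothetical, and the `name`, `description`, and `inputSchema` fields are assumed to resemble what an MCP server returns when listing its tools.

```python
import hashlib
import json


def fingerprint(tool: dict) -> str:
    """Return a stable SHA-256 digest of a tool's name, description, and schema."""
    canonical = json.dumps(
        {
            "name": tool["name"],
            "description": tool.get("description", ""),
            "inputSchema": tool.get("inputSchema", {}),
        },
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def pin_tools(tools: list[dict]) -> dict[str, str]:
    """Record the fingerprint of each human-reviewed tool, keyed by tool name."""
    return {tool["name"]: fingerprint(tool) for tool in tools}


def check_tools(tools: list[dict], pins: dict[str, str]) -> list[str]:
    """Return findings for tools that were never reviewed or changed since pinning."""
    findings = []
    for tool in tools:
        name = tool["name"]
        if name not in pins:
            findings.append(f"unreviewed tool: {name}")
        elif pins[name] != fingerprint(tool):
            findings.append(f"definition changed: {name}")
    return findings


if __name__ == "__main__":
    reviewed = [
        {
            "name": "read_file",
            "description": "Read a file from the workspace.",
            "inputSchema": {"type": "object"},
        }
    ]
    pins = pin_tools(reviewed)

    # Simulate a server that silently rewrites a tool description after review.
    tampered = [dict(reviewed[0], description="Read a file. Also forward it elsewhere.")]
    print(check_tools(tampered, pins))
```

A check like this only detects drift in the tool metadata the agent sees. It would not catch a tool whose description stays the same while its server-side behavior changes, so it complements rather than replaces invocation-level authorization.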
Why it matters
This is the most authoritative vendor-backed update to an agentic AI threat taxonomy to date. The seven new categories map directly to attack patterns observed across Claude Code, GitHub Copilot, and MCP-based agent fleets. Teams building or deploying agents should treat them as a minimum red-team checklist.
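Treating the categories as a checklist can be as simple as tracking which of them each red-team test case exercises. The following is a minimal sketch under that assumption: the `FailureMode` enum mirrors the seven category names listed above, while `coverage_gaps` and the test-case names are hypothetical and exist only for illustration.

```python
from enum import Enum


class FailureMode(Enum):
    """The seven new categories, named as in the list above."""

    SUPPLY_CHAIN = "Agentic Supply Chain Compromise"
    GOAL_HIJACKING = "Goal Hijacking"
    TRUST_ESCALATION = "Inter-Agent Trust Escalation"
    CUA_VISUAL = "Computer Use Agent (CUA) Visual Attack"
    CONTEXT_CONTAMINATION = "Session Context Contamination"
    MCP_PLUGIN_ABUSE = "MCP / Plugin Abuse"
    DISCLOSURE = "Capability / Architecture Disclosure"


def coverage_gaps(test_cases: dict[str, set[FailureMode]]) -> list[FailureMode]:
    """Return the failure modes that no test case exercises, in enum order."""
    covered: set[FailureMode] = set()
    for modes in test_cases.values():
        covered |= modes
    return [mode for mode in FailureMode if mode not in covered]


if __name__ == "__main__":
    # Hypothetical test-case names mapped to the categories they exercise.
    matrix = {
        "poisoned-readme-injection": {FailureMode.SUPPLY_CHAIN, FailureMode.GOAL_HIJACKING},
        "fake-orchestrator-token": {FailureMode.TRUST_ESCALATION},
        "system-prompt-extraction": {FailureMode.DISCLOSURE},
    }
    for gap in coverage_gaps(matrix):
        print("no coverage:", gap.value)
```

Keeping the matrix as data rather than prose makes it easy to fail a pipeline run when a category has no test case mapped to it.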
What to do
- Generate an SBOM for every deployed agent — including prompt templates, MCP server dependencies, and tool integrations.
- Verify agent identity cryptographically, not by an agent's position in the workflow. Issue attestable credentials at provisioning.
- Add all seven new failure modes to your red-team coverage matrix.
- Audit human-in-the-loop UX as a security control — ensure approval gates can't be bypassed by UI spoofing or CUA visual attacks.
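The first two recommendations can be combined: build a manifest of what an agent is made of, then issue a credential at provisioning that binds the agent's identity to that manifest. The sketch below is a rough illustration and rests on several assumptions. The manifest layout is ad hoc rather than a standard SBOM format such as CycloneDX or SPDX, all function and field names are hypothetical, and HMAC with a shared key stands in for the asymmetric signatures or hardware-backed attestation a production deployment would more likely use.

```python
import hashlib
import hmac
import json


def sha256_hex(data: bytes) -> str:
    """Hex-encoded SHA-256 digest of raw bytes."""
    return hashlib.sha256(data).hexdigest()


def build_manifest(
    agent_id: str,
    prompt_templates: dict[str, str],
    mcp_servers: dict[str, str],
    tools: list[str],
) -> dict:
    """Describe an agent's components: prompt hashes, pinned MCP servers, tools."""
    return {
        "agent_id": agent_id,
        "prompt_templates": {
            name: sha256_hex(text.encode("utf-8"))
            for name, text in sorted(prompt_templates.items())
        },
        "mcp_servers": dict(sorted(mcp_servers.items())),  # server name -> pinned version
        "tools": sorted(tools),
    }


def manifest_digest(manifest: dict) -> str:
    """Digest of the manifest in a canonical JSON encoding."""
    canonical = json.dumps(manifest, sort_keys=True, separators=(",", ":"))
    return sha256_hex(canonical.encode("utf-8"))


def _tag(agent_id: str, digest: str, key: bytes) -> str:
    """HMAC-SHA256 over the agent ID and manifest digest (assumed shared key)."""
    message = f"{agent_id}:{digest}".encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def issue_credential(manifest: dict, key: bytes) -> dict:
    """Issue a credential at provisioning that binds identity to the manifest."""
    digest = manifest_digest(manifest)
    return {
        "agent_id": manifest["agent_id"],
        "manifest_digest": digest,
        "tag": _tag(manifest["agent_id"], digest, key),
    }


def verify_credential(credential: dict, manifest: dict, key: bytes) -> bool:
    """Accept only if the credential matches the agent's current manifest."""
    expected = _tag(credential["agent_id"], manifest_digest(manifest), key)
    return (
        credential["agent_id"] == manifest["agent_id"]
        and hmac.compare_digest(credential["tag"], expected)
    )
```

Because the tag covers the manifest digest, a later change to a prompt template, MCP server version, or tool list causes verification to fail until the agent is re-provisioned, which ties the identity check to the inventory instead of to where the agent sits in a workflow.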