Microsoft: runtime inspection to block risky AI agent tool calls

• Category: Security

  • What Microsoft is pitching: treat every AI agent tool invocation like a privileged execution event (similar to command execution), and make an allow/block decision at runtime.
  • Mechanism: Copilot Studio’s orchestrator sends a webhook to Defender before a tool/topic/knowledge action executes, including parameters + prior-step outputs + user context.
  • Threat model: attackers steer the agent’s plan via prompt injection / embedded instructions / crafted documents, but stay “within permissions,” which makes traditional controls miss the abuse.
  • Control point: Defender evaluates both intent and destination of the planned action (e.g., knowledge search query, email destination) and can block the step.
  • Example 1 (event-triggered finance workflow): a malicious inbound email tries to smuggle instructions so the agent queries internal finance policies and emails results back to the attacker; runtime check blocks the knowledge search/tool call.
  • Example 2 (poisoned SharePoint doc): instructions embedded in a document steer the agent to read a sensitive file and exfiltrate it; runtime controls can block the exfil email attempt.
  • Example 3 (capability recon): attacker probes a public chatbot to enumerate tools/knowledge sources; controls can limit subsequent tool invocations triggered by the probing pattern.
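The mechanism above can be sketched as a pre-execution hook: before a planned tool call runs, its full context (tool, parameters, prior-step outputs, user) is sent to a policy check that returns allow/block. This is an illustrative sketch only; `PlannedAction`, `gated_execute`, and `email_policy` are hypothetical names, and a local callable stands in for the actual Defender webhook.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class PlannedAction:
    """A tool call the agent intends to execute, captured before it runs."""
    tool: str                       # e.g. "send_email", "knowledge_search"
    params: dict[str, Any]
    prior_outputs: list[str] = field(default_factory=list)
    user: str = "unknown"

# Stands in for the Defender webhook: sees the planned action's intent and
# destination, returns (allowed, reason).
PolicyCheck = Callable[[PlannedAction], tuple[bool, str]]

def gated_execute(action: PlannedAction,
                  check: PolicyCheck,
                  execute: Callable[[PlannedAction], Any]) -> dict:
    """Run the policy check first; only execute the tool call if allowed."""
    allowed, reason = check(action)
    if not allowed:
        return {"status": "blocked", "reason": reason}
    return {"status": "ok", "result": execute(action)}

# Example policy: block outbound email to domains outside the tenant
# allowlist (the Example-1 finance scenario). Domain is an assumption.
ALLOWED_DOMAINS = {"contoso.com"}

def email_policy(action: PlannedAction) -> tuple[bool, str]:
    if action.tool == "send_email":
        domain = action.params.get("to", "").rsplit("@", 1)[-1]
        if domain not in ALLOWED_DOMAINS:
            return False, f"external destination {domain!r} not allowlisted"
    return True, "ok"
```

The key design point is that the decision happens at the action boundary, with the concrete destination in hand, rather than at prompt time.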

Why it matters

  • “Prompt injection” stops being a content problem and becomes an execution governance problem once agents can read/write data, send emails, and mutate systems.
  • Build-time guardrails and prompt filtering won’t catch many real attacks, because the unsafe behavior only becomes obvious once you see the planned tool call (and where it’s going).
  • This framing maps well to how security teams already think: policy + telemetry + enforcement at the action boundary (like EDR/XDR for code execution).

What to do

  1. Inventory “agent tools” like privileges: list every connector/action an agent can call (email, ticket creation, CRM writes, file access). Remove what you don’t need.
  2. Add runtime gates for high-risk actions: the same idea applies even without Copilot Studio; log/inspect tool calls and enforce allow/block policies (destinations, data types, rate limits).
  3. Harden event triggers: treat inbound email/webhook triggers as untrusted input. Apply strict parsing, schema validation, and limit which tools a trigger can reach.
  4. Constrain data exfil paths: block unknown domains, require approvals for external sends, and alert on abnormal destinations or large payloads.
  5. Monitor for recon: detect repetitive “what tools do you have” probing patterns, and throttle/deny tool execution when an interaction looks like enumeration.
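Steps 2–5 can be combined into a single gate that decides per tool call. The sketch below is a minimal illustration, not a production control: the tenant domain, recon phrases, and thresholds are all assumptions, and `ToolCallGate` is a hypothetical name.

```python
import re
from collections import defaultdict

# Assumed tenant domain; anything else counts as an external destination.
EXTERNAL_DOMAIN = re.compile(r"@(?!contoso\.com$)")
# Crude enumeration signals (step 5); real detection would be richer.
RECON_PATTERNS = ("what tools", "list your tools", "what can you access")
MAX_PAYLOAD_BYTES = 10_000   # alert threshold for large sends (step 4)
MAX_PROBES = 3               # probes before tool execution is throttled

class ToolCallGate:
    def __init__(self) -> None:
        self.probe_counts: defaultdict[str, int] = defaultdict(int)

    def note_user_message(self, user: str, text: str) -> None:
        """Track enumeration-style probing per user (step 5)."""
        if any(p in text.lower() for p in RECON_PATTERNS):
            self.probe_counts[user] += 1

    def decide(self, user: str, tool: str, params: dict) -> tuple[str, str]:
        """Return (verdict, reason): allow / require_approval / deny."""
        # Step 5: throttle tool execution after repeated probing.
        if self.probe_counts[user] >= MAX_PROBES:
            return "deny", "enumeration pattern: tool execution throttled"
        # Step 4: constrain exfil paths on outbound email.
        if tool == "send_email":
            if EXTERNAL_DOMAIN.search(params.get("to", "")):
                return "require_approval", "external destination"
            if len(params.get("body", "").encode()) > MAX_PAYLOAD_BYTES:
                return "require_approval", "abnormally large payload"
        return "allow", "ok"
```

In practice the `require_approval` verdict would route to a human reviewer, and every decision (allow or not) should be logged as telemetry so the action boundary is observable as well as enforceable.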

Sources