Microsoft: runtime inspection to block risky AI agent tool calls

• Category: Security

  • What Microsoft is pitching: treat every AI agent tool invocation like a privileged execution event (similar to command execution), and make an allow/block decision at runtime.
  • Mechanism: Copilot Studio’s orchestrator sends a webhook to Defender before a tool/topic/knowledge action executes, including parameters + prior-step outputs + user context.
  • Threat model: attackers steer the agent’s plan via prompt injection / embedded instructions / crafted documents, but stay “within permissions,” which makes traditional controls miss the abuse.
  • Control point: Defender evaluates both intent and destination of the planned action (e.g., knowledge search query, email destination) and can block the step.
  • Example 1 (event-triggered finance workflow): a malicious inbound email tries to smuggle instructions so the agent queries internal finance policies and emails results back to the attacker; runtime check blocks the knowledge search/tool call.
  • Example 2 (poisoned SharePoint doc): instructions embedded in a document steer the agent to read a sensitive file and exfiltrate it; runtime controls can block the exfil email attempt.
  • Example 3 (capability recon): attacker probes a public chatbot to enumerate tools/knowledge sources; controls can limit subsequent tool invocations triggered by the probing pattern.
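The mechanism above can be sketched as a pre-execution hook: before a planned tool call runs, its full context (tool, parameters, prior-step outputs, user) is sent to a policy check that returns allow/block. This is an illustrative sketch only; `PlannedAction`, `gated_execute`, and `email_policy` are hypothetical names, and a local callable stands in for the actual Defender webhook.

```python
from dataclasses import dataclass, field
from typing import Any, Callable

@dataclass
class PlannedAction:
    """A tool call the agent intends to execute, captured before it runs."""
    tool: str                       # e.g. "send_email", "knowledge_search"
    params: dict[str, Any]
    prior_outputs: list[str] = field(default_factory=list)
    user: str = "unknown"

# Stands in for the Defender webhook: sees the planned action's intent and
# destination, returns (allowed, reason).
PolicyCheck = Callable[[PlannedAction], tuple[bool, str]]

def gated_execute(action: PlannedAction,
                  check: PolicyCheck,
                  execute: Callable[[PlannedAction], Any]) -> dict:
    """Run the policy check first; only execute the tool call if allowed."""
    allowed, reason = check(action)
    if not allowed:
        return {"status": "blocked", "reason": reason}
    return {"status": "ok", "result": execute(action)}

# Example policy: block outbound email to domains outside the tenant
# allowlist (the Example-1 finance scenario). Domain is an assumption.
ALLOWED_DOMAINS = {"contoso.com"}

def email_policy(action: PlannedAction) -> tuple[bool, str]:
    if action.tool == "send_email":
        domain = action.params.get("to", "").rsplit("@", 1)[-1]
        if domain not in ALLOWED_DOMAINS:
            return False, f"external destination {domain!r} not allowlisted"
    return True, "ok"
```

The key design point is that the decision happens at the action boundary, with the concrete destination in hand, rather than at prompt time.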

Why it matters

  • “Prompt injection” stops being a content problem and becomes an execution governance problem once agents can read/write data, send emails, and mutate systems.
  • Build-time guardrails and prompt filtering won’t catch many real attacks, because the unsafe behavior only becomes obvious once you see the planned tool call (and where it’s going).
  • This framing maps well to how security teams already think: policy + telemetry + enforcement at the action boundary (like EDR/XDR for code execution).

What to do

  1. Inventory “agent tools” like privileges: list every connector/action an agent can call (email, ticket creation, CRM writes, file access). Remove what you don’t need.
  2. Add runtime gates for high-risk actions: the same idea applies even without Copilot Studio; log/inspect tool calls and enforce allow/block policies (destinations, data types, rate limits).
  3. Harden event triggers: treat inbound email/webhook triggers as untrusted input. Apply strict parsing, schema validation, and limit which tools a trigger can reach.
  4. Constrain data exfil paths: block unknown domains, require approvals for external sends, and alert on abnormal destinations or large payloads.
  5. Monitor for recon: detect repetitive “what tools do you have” probing patterns, and throttle/deny tool execution when an interaction looks like enumeration.
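Steps 2–5 can be combined into a single gate that decides per tool call. The sketch below is a minimal illustration, not a production control: the tenant domain, recon phrases, and thresholds are all assumptions, and `ToolCallGate` is a hypothetical name.

```python
import re
from collections import defaultdict

# Assumed tenant domain; anything else counts as an external destination.
EXTERNAL_DOMAIN = re.compile(r"@(?!contoso\.com$)")
# Crude enumeration signals (step 5); real detection would be richer.
RECON_PATTERNS = ("what tools", "list your tools", "what can you access")
MAX_PAYLOAD_BYTES = 10_000   # alert threshold for large sends (step 4)
MAX_PROBES = 3               # probes before tool execution is throttled

class ToolCallGate:
    def __init__(self) -> None:
        self.probe_counts: defaultdict[str, int] = defaultdict(int)

    def note_user_message(self, user: str, text: str) -> None:
        """Track enumeration-style probing per user (step 5)."""
        if any(p in text.lower() for p in RECON_PATTERNS):
            self.probe_counts[user] += 1

    def decide(self, user: str, tool: str, params: dict) -> tuple[str, str]:
        """Return (verdict, reason): allow / require_approval / deny."""
        # Step 5: throttle tool execution after repeated probing.
        if self.probe_counts[user] >= MAX_PROBES:
            return "deny", "enumeration pattern: tool execution throttled"
        # Step 4: constrain exfil paths on outbound email.
        if tool == "send_email":
            if EXTERNAL_DOMAIN.search(params.get("to", "")):
                return "require_approval", "external destination"
            if len(params.get("body", "").encode()) > MAX_PAYLOAD_BYTES:
                return "require_approval", "abnormally large payload"
        return "allow", "ok"
```

In practice the `require_approval` verdict would route to a human reviewer, and every decision (allow or not) should be logged as telemetry so the action boundary is observable as well as enforceable.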

Sources