arXiv — GAAP: An AI Agent Execution Environment to Safeguard User Data

AI relevance: GAAP provides a deterministic execution environment for AI agents that enforces user-defined data-sharing policies without trusting the agent, the LLM, or the user prompt to be free of attacks — directly addressing the prompt injection and data exfiltration risks inherent in agentic workflows.

  • Robert Stanley published a paper on arXiv (2604.19657) presenting GAAP (Guarded Agent Access Platform), an execution environment for safeguarding user data in AI agent workflows.
  • GAAP collects permission specifications from users through dynamic prompts, then deterministically enforces that the agent's disclosures of private data comply with those specifications — without trusting the LLM or requiring the agent to be injection-free.
  • The evaluation uses a benchmark of 10 MCP servers (48 tools total), with tasks drawn from real-world scenarios and three prompt injection attacks: SSN leakage, phone number exfiltration, and SSN-swap.
  • Key design principle: the enforcement layer operates independently of the LLM, so even a successful prompt injection cannot bypass data-sharing controls. The model sits outside the trust boundary.
  • The approach addresses a critical gap: most current defenses rely on LLM-level prompt filtering or input sanitization, both of which are vulnerable to increasingly sophisticated injection techniques.
  • GAAP's guarantee is deterministic rather than probabilistic: an operation either complies with the user's policy or is blocked, which removes the false negatives that LLM-based detection can produce.
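
The enforcement idea in the bullets above can be illustrated with a minimal Python sketch. This is not GAAP's actual API; the names Policy, PolicyViolation, and guarded_call are hypothetical, and the sketch assumes the execution environment (not the LLM) already knows which tool arguments carry which category of private data.

```python
from dataclasses import dataclass, field


class PolicyViolation(Exception):
    """Raised when a tool call would disclose data the user did not permit."""


@dataclass
class Policy:
    # Hypothetical policy store: (data category, recipient tool) pairs
    # that the user has explicitly approved.
    allowed: set = field(default_factory=set)

    def permit(self, category, recipient):
        self.allowed.add((category, recipient))

    def allows(self, category, recipient):
        return (category, recipient) in self.allowed


def guarded_call(policy, tool_name, args, labels, tool_fn):
    """Run tool_fn(**args) only if every labeled argument is permitted.

    labels maps argument name -> data category. It is assumed to come from
    the execution environment, never from model output, so injected text
    cannot relabel or unlabel private data.
    """
    for arg_name, category in labels.items():
        if arg_name in args and not policy.allows(category, tool_name):
            # Plain set lookup: no model is consulted, so the decision is
            # the same for the same inputs every time.
            raise PolicyViolation(f"{category} may not be sent to {tool_name}")
    return tool_fn(**args)


if __name__ == "__main__":
    policy = Policy()
    policy.permit("phone", "book_appointment")

    # Permitted: the user approved sharing the phone number with this tool.
    guarded_call(policy, "book_appointment", {"phone": "555-0100"},
                 {"phone": "phone"}, lambda phone: f"booked for {phone}")

    # Blocked: an injected instruction tries to post the SSN elsewhere.
    try:
        guarded_call(policy, "post_comment", {"text": "123-45-6789"},
                     {"text": "ssn"}, lambda text: "posted")
    except PolicyViolation as err:
        print("blocked:", err)
```

Because the check is a lookup against user-approved pairs, what the model was persuaded to do has no bearing on the outcome; the call is simply refused.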

Why it matters

As AI agents gain access to sensitive data (emails, documents, credentials), the question of how to prevent data exfiltration — especially under prompt injection — becomes critical. GAAP's architecture of policy enforcement at the execution-environment level, rather than the model level, is a promising direction for building agents that can safely handle private data even when exposed to adversarial inputs.

What to do

  • Evaluate agent architectures that separate data-access enforcement from LLM reasoning, rather than relying on prompt-level defenses alone.
  • For agents handling sensitive data, implement allow-lists that specify which tools may access which data categories, and deny by default.
  • Monitor GAAP's development as a potential building block for privacy-preserving agent deployment.
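
As a starting point for the allow-list recommendation, a tool-to-category mapping with default deny could look like the sketch below. The tool names, category names, and helper functions are illustrative assumptions, not part of GAAP or of any MCP server.

```python
# Hypothetical allow-list: tool name -> data categories it may read.
TOOL_ALLOWLIST = {
    "calendar.create_event": frozenset({"contact", "schedule"}),
    "email.send": frozenset({"contact"}),
    "web.search": frozenset(),  # never receives private data
}


def can_access(tool, category):
    """Default deny: tools missing from the allow-list get no categories."""
    return category in TOOL_ALLOWLIST.get(tool, frozenset())


def filter_for_tool(tool, record, field_categories):
    """Return only the fields of record that the tool may see.

    field_categories maps field name -> data category. Fields with no
    assigned category are dropped, so the filter fails closed.
    """
    return {
        name: value
        for name, value in record.items()
        if name in field_categories and can_access(tool, field_categories[name])
    }


if __name__ == "__main__":
    record = {"name": "Alex", "ssn": "123-45-6789", "note": "unlabeled"}
    categories = {"name": "contact", "ssn": "government_id"}
    print(filter_for_tool("email.send", record, categories))
    print(filter_for_tool("web.search", record, categories))
```

Filtering before the tool is invoked keeps the check outside the model, in line with the first recommendation above.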

Sources