Brex — CrabTrap Open-Source LLM-as-a-Judge Proxy for AI Agent Security

AI relevance: CrabTrap addresses the core security gap in production AI agents — uncontrolled outbound API calls to real systems like Slack, Gmail, and GitHub — by intercepting every request and evaluating it against a security policy before it reaches the internet.

What happened

  • Brex (now a Capital One subsidiary) released CrabTrap as an open-source project: an HTTP/HTTPS proxy designed specifically to secure AI agents making outbound API calls.
  • CrabTrap sits between the agent and the internet, intercepting every outbound request and evaluating it against a security policy — blocking or forwarding in real time.
  • It uses a two-tier evaluation model: deterministic static rules (prefix, exact, or glob URL patterns) are checked first; if no rule matches, an LLM judge evaluates the request against a natural-language security policy.
  • Deny rules always take priority over allow rules, and every decision is logged to PostgreSQL for a complete audit trail.
  • Built-in SSRF protection blocks requests to private networks (RFC 1918, loopback, link-local, Carrier-Grade NAT) with DNS-rebinding prevention.
  • Prompt injection defense: request payloads are JSON-encoded and policy content is JSON-escaped before being sent to the LLM judge.
  • Additional features: per-IP rate limiting (token bucket), circuit breaker (trips after 5 consecutive LLM failures), configurable fallback (deny or passthrough when LLM is unavailable), and an agentic policy builder that drafts policies from observed traffic.
  • Runs as a Docker container alongside PostgreSQL; setup takes roughly 30 minutes. Code is at github.com/brexhq/CrabTrap.
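The two-tier evaluation and deny-over-allow precedence described above can be sketched roughly as follows. This is a minimal illustration, not CrabTrap's actual code: the rule shapes and the `llm_judge` callable are assumptions for the sketch.

```python
from fnmatch import fnmatch

# Hypothetical rule shapes: (match_type, pattern, action)
STATIC_RULES = [
    ("prefix", "https://internal.example.com/", "deny"),
    ("glob",   "https://api.slack.com/*",       "allow"),
    ("exact",  "https://api.github.com/user",   "allow"),
]

def rule_matches(match_type, pattern, url):
    if match_type == "prefix":
        return url.startswith(pattern)
    if match_type == "exact":
        return url == pattern
    if match_type == "glob":
        return fnmatch(url, pattern)
    return False

def evaluate(url, llm_judge):
    """Tier 1: static rules, with deny taking priority over allow.
    Tier 2: fall back to the LLM judge when no static rule matches."""
    matched = [action for mt, pat, action in STATIC_RULES
               if rule_matches(mt, pat, url)]
    if "deny" in matched:           # deny always wins over allow
        return "deny"
    if "allow" in matched:
        return "allow"
    return llm_judge(url)           # semantic evaluation against the policy

# Example with a fail-closed judge stub:
print(evaluate("https://internal.example.com/secrets", lambda u: "deny"))     # deny
print(evaluate("https://api.slack.com/chat.postMessage", lambda u: "deny"))   # allow
```

A request hitting no static rule pays the LLM round-trip; everything else is decided deterministically, which is why the static tier is checked first.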

Why it matters

  • Production AI agents make real API calls to real systems — and their behavior is non-deterministic. A prompt-injected agent can modify CRM records, send emails, or exfiltrate data through legitimate API endpoints.
  • CrabTrap's two-tier model is practical: static rules handle the obvious cases with negligible latency, while the LLM judge catches semantically ambiguous violations that regex-based WAFs miss.
  • The project is notable because it comes from Brex, a company that runs AI agents in production at enterprise scale — this is a tool born from real operational need, not theoretical research.
  • Crucially, it's a forward proxy (outbound-only). It doesn't replace WAFs or inbound firewalls — it fills the gap that most agent security tools ignore: what the agent sends out.
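As a concrete illustration of the SSRF protections listed under "What happened", the blocked ranges (RFC 1918, loopback, link-local, Carrier-Grade NAT) can be checked with the Python standard library. A rough sketch under those assumptions, not the project's implementation:

```python
import ipaddress
import socket

# Private/special ranges per the SSRF protections described above
BLOCKED_NETS = [
    ipaddress.ip_network("10.0.0.0/8"),        # RFC 1918
    ipaddress.ip_network("172.16.0.0/12"),     # RFC 1918
    ipaddress.ip_network("192.168.0.0/16"),    # RFC 1918
    ipaddress.ip_network("127.0.0.0/8"),       # loopback
    ipaddress.ip_network("169.254.0.0/16"),    # link-local
    ipaddress.ip_network("100.64.0.0/10"),     # Carrier-Grade NAT (RFC 6598)
]

def is_blocked_ip(ip_str):
    ip = ipaddress.ip_address(ip_str)
    return any(ip in net for net in BLOCKED_NETS)

def resolve_and_check(hostname):
    """Resolve once and validate every returned address. Connecting to the
    same validated IP (rather than re-resolving at connect time) is the
    standard mitigation for DNS rebinding."""
    addrs = {info[4][0] for info in
             socket.getaddrinfo(hostname, None, socket.AF_INET)}
    return {addr: is_blocked_ip(addr) for addr in addrs}

print(is_blocked_ip("192.168.1.5"))   # True
print(is_blocked_ip("100.64.0.1"))    # True
print(is_blocked_ip("8.8.8.8"))       # False
```

The key design point is resolving and validating in one step: if the proxy re-resolved the hostname when opening the connection, an attacker-controlled DNS record could answer with a public IP during the check and a private IP afterward.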

What to do

  • If you deploy AI agents that call external APIs, evaluate CrabTrap as an additional control layer between agents and the internet.
  • Start with static deny rules for known-bad destinations (private networks, sensitive internal APIs) before configuring LLM policies.
  • Use the policy builder to draft initial policies from observed traffic, then refine with eval replay against historical audit logs.
  • Be aware: CrabTrap does not provide human-in-the-loop approval, filter API responses, or redact sensitive data. It's one layer, not a complete solution.
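The circuit-breaker and fallback behavior noted under "What happened" (tripping after 5 consecutive LLM failures, then denying or passing through) can be sketched as follows. Class and method names here are illustrative assumptions, not CrabTrap's API:

```python
class LLMCircuitBreaker:
    """Trips open after N consecutive LLM-judge failures; while open,
    decisions use a configured fallback ("deny" or "passthrough")."""

    def __init__(self, failure_threshold=5, fallback="deny"):
        self.failure_threshold = failure_threshold
        self.fallback = fallback
        self.consecutive_failures = 0

    @property
    def open(self):
        return self.consecutive_failures >= self.failure_threshold

    def judge(self, request, llm_call):
        if self.open:
            return self.fallback           # skip the LLM entirely while tripped
        try:
            decision = llm_call(request)
            self.consecutive_failures = 0  # any success resets the counter
            return decision
        except Exception:
            self.consecutive_failures += 1
            return self.fallback           # this request also uses the fallback

breaker = LLMCircuitBreaker(failure_threshold=5, fallback="deny")

def flaky_llm(req):
    raise TimeoutError("LLM unavailable")

for _ in range(6):
    breaker.judge({"url": "https://api.example.com"}, flaky_llm)
print(breaker.open)  # True: tripped after 5 consecutive failures
```

The fallback choice is the security trade-off: "deny" fails closed (agents stall when the LLM is down), while "passthrough" fails open (traffic flows unjudged). A real implementation would also need a recovery path, such as a half-open probe after a cooldown.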

Sources