Brex — CrabTrap Open-Source LLM-as-a-Judge Proxy for AI Agent Security
AI relevance: CrabTrap addresses the core security gap in production AI agents — uncontrolled outbound API calls to real systems like Slack, Gmail, and GitHub — by intercepting every request and evaluating it against a security policy before it reaches the internet.
What happened
- Brex (now a Capital One subsidiary) released CrabTrap as an open-source project: an HTTP/HTTPS proxy designed specifically to secure AI agents making outbound API calls.
- CrabTrap sits between the agent and the internet, intercepting every outbound request and evaluating it against a security policy — blocking or forwarding in real time.
- It uses a two-tier evaluation model: deterministic static rules (prefix, exact, or glob URL patterns) are checked first; if no rule matches, an LLM judge evaluates the request against a natural-language security policy.
- Deny rules always take priority over allow rules, and every decision is logged to PostgreSQL for a complete audit trail.
- Built-in SSRF protection blocks requests to private networks (RFC 1918, loopback, link-local, Carrier-Grade NAT) with DNS-rebinding prevention.
- Prompt injection defense: request payloads are JSON-encoded and policy content is JSON-escaped before being sent to the LLM judge.
- Additional features: per-IP rate limiting (token bucket), circuit breaker (trips after 5 consecutive LLM failures), configurable fallback (deny or passthrough when LLM is unavailable), and an agentic policy builder that drafts policies from observed traffic.
- Runs as a Docker container alongside PostgreSQL; setup takes roughly 30 minutes. Code is at github.com/brexhq/CrabTrap.
Why it matters
- Production AI agents make real API calls to real systems — and their behavior is non-deterministic. A prompt-injected agent can modify CRM records, send emails, or exfiltrate data through legitimate API endpoints.
- CrabTrap's two-tier model is practical: static rules handle the obvious cases with zero latency, while the LLM judge catches semantically ambiguous violations that regex-based WAFs miss.
- The project is notable because it comes from Brex, a company that runs AI agents in production at enterprise scale — this is a tool born from real operational need, not theoretical research.
- Crucially, it's a forward proxy (outbound-only). It doesn't replace WAFs or inbound firewalls — it fills the gap that most agent security tools ignore: what the agent sends out.
What to do
- If you deploy AI agents that call external APIs, evaluate CrabTrap as an additional control layer between agents and the internet.
- Start with static deny rules for known-bad destinations (private networks, sensitive internal APIs) before configuring LLM policies.
- Use the policy builder to draft initial policies from observed traffic, then refine with eval replay against historical audit logs.
- Be aware: CrabTrap does not provide human-in-the-loop approval, filter API responses, or redact sensitive data. It's one layer, not a complete solution.