T3MP3ST — Open-Source Framework Turns AI Coding Agents Into Autonomous Red Teamers

AI relevance: T3MP3ST repurposes existing AI coding agents (Claude Code, Codex, Hermes) as autonomous offensive security operators, demonstrating how agent tooling can be weaponized for red-teaming without additional model infrastructure.

Key Findings

  • Released July 5, 2026 by researcher elder-plinius under the AGPL-3.0 license
  • Acts as a multi-agent orchestration layer: it ships no model of its own and instead coordinates existing agent instances through an 8-operator kill chain (Recon, Scanner, Exploiter, Infiltrator, Exfiltrator, Ghost, Coordinator, Analyst)
  • Achieved 90.1% pass@1 on XBOW's 104-challenge XBEN black-box suite and 23/40 hint-free solves on Cybench (40-task academic benchmark)
  • On 10 real CVEs disclosed in 2026 across 7 languages, a single agent pinned 8 of 10 to the exact file, line, and CWE classification; because the bugs postdate the models' training cutoff, memorization is an unlikely explanation
  • "Keyless warfare" design: leverages existing agent sessions already running on the operator's machine, requiring no separate provider keys or cloud billing
  • Egress-scope containment: networked tools refuse to touch public hosts outside the authorized scope
  • Web-based "War Room" interface and CLI for target authorization
  • Downstream operators (Infiltrator, Exfiltrator, Ghost) remain experimental; coordinated swarm exploitation has not yet been validated at scale
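
The egress-scope containment described in the findings can be pictured as a deny-by-default check that runs before any networked tool touches a target. The sketch below is an illustration only, not T3MP3ST's actual implementation: the names AUTHORIZED_NETWORKS, AUTHORIZED_HOSTS, and in_scope, and the example addresses, are assumptions made for this example.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical scope definition; values are illustrative, not T3MP3ST's config.
AUTHORIZED_NETWORKS = [ipaddress.ip_network("10.20.0.0/16")]
AUTHORIZED_HOSTS = {"staging.example.test"}


def in_scope(target: str) -> bool:
    """Return True only if the target (URL or bare host) is explicitly authorized."""
    # Accept either a full URL or a bare hostname / IP address.
    host = urlsplit(target).hostname if "://" in target else target
    if not host:
        return False
    host = host.lower().rstrip(".")
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: fall back to an exact hostname match.
        # A production check would also resolve the name and re-check the IP.
        return host in AUTHORIZED_HOSTS
    # IP literal: allowed only inside an authorized network.
    return any(addr in net for net in AUTHORIZED_NETWORKS)
```

Anything that fails to parse or match is treated as out of scope, so a malformed target is refused and never tested by accident.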

Why It Matters

T3MP3ST represents a shift from purpose-built security tools to agent-as-platform offensive frameworks. By reusing the same coding agents developers already trust with source code access, it collapses the distinction between development tooling and red-team infrastructure. The "keyless" design is particularly notable: because it piggybacks on authenticated agent sessions, it inherits whatever cloud credentials, SSH keys, and repository access the coding agent already holds. Supply-chain attackers exploit the same pattern in the opposite direction: the architecture that enables autonomous pentesting also describes how a compromised agent could autonomously exfiltrate data.

What To Do

  • Security teams should inventory which AI coding agents have production credentials and scope their access accordingly
  • Agent orchestration frameworks should enforce egress allowlists even for "authorized" testing tools
  • Red team operators: the framework is strictly for authorized testing; using it against systems without written permission is illegal in most jurisdictions
  • Monitor for similar multi-agent kill-chain frameworks appearing in community repositories
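
As a starting point for the credential inventory recommended above, a short script can list what an agent process would inherit from the account it runs under. This is a minimal sketch under stated assumptions: the name patterns and file paths are common defaults chosen for illustration, not an exhaustive or authoritative list, and the function names are hypothetical.

```python
import os
import re
from pathlib import Path

# Illustrative patterns only; tune them for your environment.
SENSITIVE_NAME = re.compile(
    r"(TOKEN|SECRET|PASSWORD|API_?KEY|ACCESS_KEY|CREDENTIAL)", re.IGNORECASE
)
# Common default credential locations, relative to the home directory.
CREDENTIAL_PATHS = [".ssh", ".aws/credentials", ".kube/config", ".netrc"]


def sensitive_env_names(env):
    """Return names (never values) of variables that look like credentials."""
    return sorted(name for name in env if SENSITIVE_NAME.search(name))


def present_credential_paths(home):
    """Return the known credential paths that exist under the given home."""
    return [p for p in CREDENTIAL_PATHS if (Path(home) / p).exists()]


if __name__ == "__main__":
    # Run this as the same user, in the same shell, that launches the agent.
    print("env:", sensitive_env_names(os.environ))
    print("files:", present_credential_paths(Path.home()))
```

The script reports names and paths only, never secret values, so its output can be shared with a security team when deciding which credentials an agent session actually needs.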

Sources