T3MP3ST — Open-Source Framework Turns AI Coding Agents Into Autonomous Red Teamers
AI relevance: T3MP3ST repurposes existing AI coding agents (Claude Code, Codex, Hermes) as autonomous offensive security operators, demonstrating how agent tooling can be weaponized for red-teaming without additional model infrastructure.
Key Findings
- Released July 5, 2026 by researcher elder-plinius under the AGPL-3.0 license
- Acts as a multi-agent orchestration layer: it ships no model of its own and instead coordinates existing agent instances through an 8-operator kill chain (Recon, Scanner, Exploiter, Infiltrator, Exfiltrator, Ghost, Coordinator, Analyst)
- Achieved 90.1% pass@1 on XBOW's 104-challenge XBEN black-box suite and 23 of 40 hint-free solves (57.5%) on Cybench, a 40-task academic benchmark
- On 10 real CVEs disclosed in 2026 across 7 languages, a single agent pinned 8 of 10 to the exact file, line, and CWE classification; the bugs postdate the training cutoff, which rules out memorization
- "Keyless warfare" design: leverages agent sessions already running on the operator's machine, so it requires no separate provider keys or cloud billing
- Egress-scope containment ensures networked tools refuse to touch off-scope public hosts
- Web-based "War Room" interface and CLI for target authorization
- Downstream operators (Infiltrator, Exfiltrator, Ghost) remain experimental; coordinated swarm exploitation has not yet been validated at scale
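The egress-scope containment described above can be pictured with a minimal sketch. This is not T3MP3ST's code: the names `ALLOWED_HOSTS`, `ALLOWED_NETWORKS`, `in_scope`, and `guarded_request` are hypothetical, and the scope format is an assumption made for illustration. The idea is only that every networked tool checks its target against an explicit authorization list and refuses before any connection is attempted.

```python
import ipaddress
from urllib.parse import urlsplit

# Hypothetical scope definition. The names and structure are assumptions,
# not T3MP3ST's actual configuration format.
ALLOWED_HOSTS = {"staging.example.test"}
ALLOWED_NETWORKS = [ipaddress.ip_network("10.20.0.0/16")]


def in_scope(url: str) -> bool:
    """Return True only if the URL's host is explicitly authorized."""
    host = urlsplit(url).hostname  # lowercased by urlsplit, None if absent
    if host is None:
        return False
    try:
        addr = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal: require an exact hostname match.
        return host in ALLOWED_HOSTS
    return any(addr in net for net in ALLOWED_NETWORKS)


def guarded_request(url: str) -> str:
    """Refuse off-scope targets before any network call is attempted."""
    if not in_scope(url):
        raise PermissionError(f"off-scope target refused: {url}")
    # A real tool would issue the request here; the sketch only reports it.
    return f"would contact {url}"
```

The check is deny-by-default: a target with no parseable host, an unlisted hostname, or an address outside the listed networks is refused. A hostname match alone does not prove where traffic ends up, so a production guard would presumably also validate the resolved address.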
Why It Matters
T3MP3ST represents a shift from purpose-built security tools to agent-as-platform offensive frameworks. By reusing the same coding agents developers already trust with source code access, it collapses the distinction between development tooling and red-team infrastructure. The "keyless" design is particularly notable: because it piggybacks on authenticated agent sessions, it inherits whatever cloud credentials, SSH keys, and repository access the coding agent already holds. Supply-chain attackers exploit the same pattern from the other side: the architecture that enables autonomous pentesting also describes how a compromised agent could autonomously exfiltrate data.
What To Do
- Security teams should inventory which AI coding agents have production credentials and scope their access accordingly
- Agent orchestration frameworks should enforce egress allowlists even for "authorized" testing tools
- Red team operators: the framework is strictly for authorized testing; using it against systems without written permission is illegal in most jurisdictions
- Monitor for similar multi-agent kill-chain frameworks appearing in community repositories
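As a starting point for the credential inventory recommended above, the following sketch lists what a process launched with a given environment and home directory could read. It is an illustration, not part of T3MP3ST or any agent's tooling: the function name `inventory`, the name pattern, and the file list are all assumptions and will need extending for a real environment.

```python
import re
from pathlib import Path

# Illustrative patterns and paths only (assumptions, not an exhaustive list).
SECRET_NAME = re.compile(
    r"(TOKEN|SECRET|PASSWORD|API_?KEY|ACCESS_KEY)", re.IGNORECASE
)
CREDENTIAL_FILES = [
    ".aws/credentials",
    ".ssh/id_rsa",
    ".ssh/id_ed25519",
    ".netrc",
    ".git-credentials",
]


def inventory(env: dict[str, str], home: Path) -> dict[str, list[str]]:
    """List credentials visible to a process with this env and home directory.

    Only variable names and relative file paths are reported, never values.
    """
    env_hits = sorted(name for name in env if SECRET_NAME.search(name))
    file_hits = [rel for rel in CREDENTIAL_FILES if (home / rel).is_file()]
    return {"env_vars": env_hits, "files": file_hits}
```

Calling `inventory(dict(os.environ), Path.home())` from the same shell that launches a coding agent gives a rough picture of what that agent inherits. An empty result does not prove the agent is credential-free, since the sketch only checks variable names and a handful of well-known file locations.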