ExploitBench — AI Agents Achieve Arbitrary Code Execution on V8

AI relevance: Frontier AI agents can now autonomously develop working exploits that achieve arbitrary code execution in Google's V8 engine — the JavaScript runtime powering Chrome, Edge, Node.js, and Cloudflare Workers.

Key Findings

  • Carnegie Mellon University researchers released ExploitBench, a capability-graded benchmark that decomposes exploitation into 16 measurable flags — from triggering a bug through arbitrary code execution (T1) — verified by deterministic oracles with randomized challenges.
  • Claude Mythos Preview with human "nudges" scored 9.90/16 on average and reached full code execution (T1) on 21 of 41 V8 vulnerabilities. Fully autonomous mode scored 9.55 — barely any drop.
  • GPT-5.5 via Codex reached T1 on only 2 of 41 bugs, scoring 5.51 (assisted) and 4.30 (autonomous). No other public model achieved arbitrary code execution.
  • Mythos reproduced CVE-2024-0519, a bug human researchers had failed to exploit for over a year, and developed an exploit technique that a co-author (Seunghyun Lee, with 20+ browser CVEs) had previously dismissed as too complex.
  • Cost difference is stark: Mythos cost ~$36,428 across 122 episodes; GPT-5.5/Codex ran 123 episodes for ~$3,075 — roughly 12× cheaper but far less capable.
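
The cost and score comparisons above reduce to a few lines of arithmetic. The sketch below uses only the figures quoted in these findings; the dictionary layout and the summarize helper are illustrative names, not part of ExploitBench. It assumes the episode totals and T1 counts are directly comparable across the two models, which this summary does not confirm.

```python
# Figures copied from the findings above; everything else is derived.
TOTAL_BUGS = 41

RUNS = {
    "Mythos": {
        "cost_usd": 36_428, "episodes": 122,
        "assisted": 9.90, "autonomous": 9.55, "t1_bugs": 21,
    },
    "GPT-5.5/Codex": {
        "cost_usd": 3_075, "episodes": 123,
        "assisted": 5.51, "autonomous": 4.30, "t1_bugs": 2,
    },
}


def summarize(run):
    """Return (cost per episode, relative score drop without nudges, T1 rate)."""
    cost_per_episode = run["cost_usd"] / run["episodes"]
    relative_drop = (run["assisted"] - run["autonomous"]) / run["assisted"]
    t1_rate = run["t1_bugs"] / TOTAL_BUGS
    return cost_per_episode, relative_drop, t1_rate


for name, run in RUNS.items():
    cost, drop, t1 = summarize(run)
    print(f"{name}: ${cost:.2f}/episode, "
          f"{drop:.1%} score drop when autonomous, T1 on {t1:.1%} of bugs")

# Ratio of total spend, the basis for the "roughly 12x" figure.
ratio = RUNS["Mythos"]["cost_usd"] / RUNS["GPT-5.5/Codex"]["cost_usd"]
print(f"Total cost ratio: {ratio:.1f}x")
```

On these figures GPT-5.5/Codex works out to $25 per episode, and its score falls by roughly a fifth without assistance, against a few percent for Mythos.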

Why It Matters

ExploitBench is the first benchmark that measures exploit progression rather than a binary crash outcome. The results show that private frontier models have crossed a threshold: they can construct working browser exploits from known vulnerabilities with little or no manual steering. The V8 engine powers billions of endpoints. Even with ASLR and modern mitigations in place, the gap between "model can trigger a crash" and "model can build arbitrary code execution" has narrowed sharply: the strongest model reached arbitrary code execution on 21 of 41 bugs with light human nudges, and its average score fell only from 9.90 to 9.55 when run fully autonomously.

For AI security teams, this means the same models being deployed for software engineering can be repurposed as autonomous vulnerability researchers — and the cost curve is dropping fast.

What to Do

  • Browser and runtime users: Keep Chrome, Edge, and Node.js fully patched. The dataset uses known CVEs, but the techniques generalize.
  • AI platform operators: Audit agent tool access, and confirm that agents run in sandboxes that prevent arbitrary code execution and block network egress to internal infrastructure.
  • Security teams: Treat AI-assisted exploit development as a near-term threat, not a theoretical one. Plan for compressed exploit timelines as the benchmark goes public.
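
As one concrete starting point for the egress audit suggested above, the sketch below flags allowlist entries that point at internal address space. It is a minimal illustration only: the allowlist format and the internal_destinations helper are hypothetical, and it checks literal IP addresses only, so hostnames still need a separate DNS-aware review.

```python
import ipaddress


def internal_destinations(allowed_hosts):
    """Return allowlist entries that are literal IPs in internal ranges.

    Hypothetical input: a list of strings taken from an agent tool's
    egress allowlist. Hostnames are skipped, not resolved.
    """
    flagged = []
    for host in allowed_hosts:
        try:
            addr = ipaddress.ip_address(host)
        except ValueError:
            # Not a literal IP address (for example a hostname); skip it here.
            continue
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            flagged.append(host)
    return flagged


# Example allowlist (made-up values for illustration).
example = ["10.0.0.5", "8.8.8.8", "169.254.169.254", "api.example.com"]
print(internal_destinations(example))
```

An entry such as 169.254.169.254 is worth particular attention because cloud providers commonly serve instance metadata from that link-local address.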

Sources