Cloudflare — Project Glasswing: What Mythos Found Across 50+ Repositories

AI relevance: Cloudflare published one of the first detailed, public assessments of Anthropic's Mythos Preview running against production-grade infrastructure code, offering concrete benchmarks on what security-focused LLMs can and can't do today.

What they did

  • Cloudflare ran Mythos Preview and other security-focused LLMs against 50+ of its own repositories covering critical infrastructure components.
  • The goal was twofold: find real vulnerabilities in Cloudflare's systems, and understand what attackers will soon be able to do with these models.
  • Results covered strengths, weaknesses, and the process gaps that must close before AI-driven vulnerability discovery can scale.

Key findings

  • Not an incremental improvement: Cloudflare describes the jump from general-purpose frontier models to Mythos Preview as "a different kind of tool doing a different kind of work," rather than a refinement of what came before.
  • Emergent guardrails: Although it lacks the safeguards built into general-release models (such as Opus 4.7 or GPT-5.5), Mythos Preview still pushes back on certain requests on its own, producing false negatives even for legitimate security research queries.
  • Architecture matters more than model choice: The surrounding process (repository selection, scoping, review workflow) determines output quality at least as much as the model itself.
  • Scale is the unsolved problem: Cloudflare identifies gaps in how results are triaged, deduplicated, and validated before they reach developers. Finding bugs is not the bottleneck; handling the volume is.
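
To make the triage and deduplication gap concrete, here is a minimal Python sketch of collapsing duplicate findings before they reach a reviewer. It is an illustration only, not Cloudflare's pipeline: the Finding fields, the severity scale, and the fingerprinting rule are all assumptions chosen for the example.

```python
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class Finding:
    # All fields are hypothetical; a real scanner's output will differ.
    repo: str
    path: str
    weakness: str   # e.g. a CWE identifier
    snippet: str    # code the model flagged
    severity: int   # assumed scale: higher means more severe


def fingerprint(f: Finding) -> str:
    """Hash the location, weakness class, and whitespace-normalized snippet."""
    normalized = " ".join(f.snippet.split()).lower()
    key = "\x00".join([f.repo, f.path, f.weakness, normalized])
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def dedupe_and_rank(findings):
    """Keep the most severe report per fingerprint, most severe first."""
    best = {}
    for f in findings:
        fp = fingerprint(f)
        if fp not in best or f.severity > best[fp].severity:
            best[fp] = f
    return sorted(best.values(), key=lambda f: f.severity, reverse=True)
```

Exact-match fingerprints only catch near-identical reports. Findings that describe the same bug in different words would still need fuzzy matching or human review, which is the validation step the brief says is still missing.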

Why it matters

Cloudflare is one of the first major infrastructure operators to publish a transparent evaluation of defensive AI security models on real code. The finding that Mythos Preview shows emergent refusal behavior even without the safeguards applied to general-release models is a significant data point for the AI security community: it suggests safety-like behavior may be baked into model training in ways that can't simply be stripped away. The post also sets expectations. The value of these models today lies in the architecture and human-in-the-loop process around them, not in raw autonomous discovery.

What to do

  • Read Cloudflare's full evaluation if you're evaluating Mythos Preview or comparable models for your own codebase.
  • Plan your AI vulnerability-discovery pipeline around triage and review capacity, not just model throughput.
  • Test models against your own repos before assuming emergent guardrails won't interfere with legitimate security queries.
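
One way to act on the last point is to measure how often a model declines a fixed set of legitimate queries. The Python sketch below assumes a caller-supplied ask_model function (hypothetical, standing in for whatever client you use) and a deliberately naive phrase list for spotting refusals. Neither comes from Cloudflare's post.

```python
from typing import Callable, Iterable

# Assumed refusal phrases; tune these against real replies from your model.
REFUSAL_MARKERS = ("i can't help", "i cannot help", "i won't", "unable to assist")


def looks_like_refusal(reply: str) -> bool:
    """Naive heuristic: case-insensitive substring match on known phrases."""
    text = reply.lower().replace("\u2019", "'")  # normalize curly apostrophes
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rate(queries: Iterable[str], ask_model: Callable[[str], str]) -> float:
    """Fraction of queries whose reply looks like a refusal (0.0 if none given)."""
    queries = list(queries)
    if not queries:
        return 0.0
    refused = sum(1 for q in queries if looks_like_refusal(ask_model(q)))
    return refused / len(queries)
```

A substring heuristic will miss partial or softened refusals, so treat the number as a rough floor and sample replies by hand before relying on it.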

Sources