Claude Mythos: When AI Becomes a Zero-Day Machine
The security industry has spent thirty years building fences. Anthropic just released a model that finds every gap in all of them—autonomously, overnight, at scale.
Claude Mythos Preview, announced on April 7, 2026, is Anthropic’s new general-purpose language model. But calling it a “language model” undersells what has happened here. Mythos Preview found and exploited zero-day vulnerabilities across every major operating system and every major web browser—without human guidance, without prior knowledge of the specific bugs, and in hours rather than weeks.
Anthropic calls the response effort Project Glasswing: a coordinated push to deploy Mythos Preview on defense before attackers develop comparable capability elsewhere. It is the largest AI-driven vulnerability disclosure effort ever announced.
Key Takeaways
- Claude Mythos Preview autonomously discovered and exploited zero-day vulnerabilities across every major OS and browser—including a 27-year-old OpenBSD bug and a 17-year-old FreeBSD remote root exploit.
- The capability jump from Opus 4.6 is staggering: Opus 4.6 succeeded in writing Firefox JavaScript shell exploits twice out of several hundred attempts. Mythos Preview succeeded 181 times on the same benchmark.
- Project Glasswing is Anthropic’s coordinated defensive deployment—releasing Mythos Preview first to critical infrastructure partners and open source developers to patch systems before similar capability reaches adversaries.
- Over 99% of discovered vulnerabilities are not yet patched, which is why Anthropic is disclosing only a fraction of its findings publicly.
- Non-experts can also leverage this capability. Engineers with no formal security training prompted Mythos Preview overnight and woke up to complete, working exploits.
- These capabilities were not explicitly trained in. They emerged as a downstream consequence of general improvements in code understanding, reasoning, and autonomous execution.
What “Zero-Day” Capability Actually Means Here
A zero-day vulnerability is one that defenders do not yet know about: no CVE, no patch, no public discussion. When Mythos Preview finds one, the result cannot be recall from training data, because no write-up of the bug has ever existed. The model is reasoning its way to exploitable flaws in live codebases.
That’s not a subtle distinction. It means this isn’t retrieval of known bugs. It’s genuine vulnerability analysis.
Anthropic’s red team technical report details several standout findings:
The 27-Year-Old OpenBSD Bug
OpenBSD is not a casual target. It is an operating system built around security, regularly audited by experienced researchers. Mythos Preview analyzed its SACK TCP implementation and identified a two-step integer overflow chain: an unchecked lower bound in SACK block processing combined with a signed integer overflow triggered 2^31 bytes away from the real window—resulting in a NULL pointer write that crashes the machine remotely.
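To make the signed-overflow step concrete, here is a minimal sketch of the signed-difference idiom commonly used to compare wrapping 32-bit sequence numbers. It is not OpenBSD's code; the helper names (`to_int32`, `seq_lt`) and the example values are hypothetical. It only illustrates why a value placed about 2^31 away from the real window can flip the result of a "before/after" comparison.

```python
MOD = 1 << 32  # TCP sequence numbers are 32-bit and wrap around


def to_int32(x: int) -> int:
    """Interpret the low 32 bits of x as a signed two's-complement integer."""
    x &= MOD - 1
    return x - MOD if x >= (1 << 31) else x


def seq_lt(a: int, b: int) -> bool:
    """True if a is 'before' b under the signed-difference comparison idiom."""
    return to_int32(a - b) < 0


# Hypothetical values chosen only for illustration.
window_start = 1_000
near = window_start + 100                       # just ahead of the window
far = (window_start + (1 << 31) + 100) % MOD    # a bit more than 2^31 ahead
half = (window_start + (1 << 31)) % MOD         # exactly 2^31 ahead

print(seq_lt(window_start, near))  # True: near is ahead, as expected
print(seq_lt(window_start, far))   # False: far now looks like it is behind
print(seq_lt(window_start, half), seq_lt(half, window_start))  # True True
```

At exactly 2^31 apart, each value compares as "before" the other, so any bounds check built on this comparison needs an independent range check on attacker-supplied values.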
Fewer than a thousand parallel scaffold runs, costing under $20,000 in total, surfaced this bug and several dozen additional findings. The specific run that identified the OpenBSD crash cost under $50.
The 17-Year-Old FreeBSD Root Exploit
Mythos Preview didn’t just find a vulnerability in FreeBSD’s NFS kernel server—it wrote a complete, working remote code execution exploit. Fully autonomously. The exploit chains together an RPCSEC_GSS stack buffer overflow (missing stack canary because the buffer was typed int32_t[32], not char[]), a multi-packet ROP chain, and a pre-exploitation step that leaks the host UUID from an unauthenticated NFSv4 response to reconstruct the required kernel values.
The full exploit spans six sequential RPC requests, each feeding another piece of shellcode into kernel memory before the final call writes an SSH authorized key for full root access.
This is CVE-2026-4747, now public.
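The missing-canary detail is easier to see with a toy model of how compilers decide which functions get a stack protector. The sketch below is a simplified approximation of the documented GCC/Clang heuristics, not the compilers' actual logic: it assumes the basic `-fstack-protector` mode only instruments functions with character arrays at or above a size threshold (8 bytes by default), while `-fstack-protector-strong` instruments any function with a local array. The class and function names are hypothetical.

```python
from dataclasses import dataclass

# Element types treated as "character arrays" in this simplified model.
CHAR_TYPES = {"char", "signed char", "unsigned char"}


@dataclass
class LocalArray:
    elem_type: str  # C element type, e.g. "char" or "int32_t"
    length: int     # number of elements
    elem_size: int  # bytes per element

    @property
    def size_bytes(self) -> int:
        return self.length * self.elem_size


def gets_canary(arrays, mode="basic", ssp_buffer_size=8):
    """Approximate whether a function with these local arrays gets a canary."""
    if mode == "strong":
        # Assumed: any local array triggers protection in strong mode.
        return len(arrays) > 0
    # Assumed: basic mode only counts sufficiently large character arrays.
    return any(
        a.elem_type in CHAR_TYPES and a.size_bytes >= ssp_buffer_size
        for a in arrays
    )


int_buf = [LocalArray("int32_t", 32, 4)]   # 128 bytes, but not a char array
char_buf = [LocalArray("char", 128, 1)]    # same size, character-typed

print(gets_canary(int_buf, mode="basic"))   # False
print(gets_canary(int_buf, mode="strong"))  # True
print(gets_canary(char_buf, mode="basic"))  # True
```

Under these assumptions, a 128-byte `int32_t[32]` buffer is just as overflowable as a `char[128]` one, yet only the latter is protected in basic mode.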
The 16-Year-Old FFmpeg Bug
FFmpeg processes video for nearly every major platform on the internet. Mythos Preview identified a sentinel collision in the H.264 codec, in code that dates back to a 2003 commit, triggered when an attacker constructs a single frame with 65,536 slices. It had evaded every fuzzer and human reviewer for well over a decade.
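A sentinel collision of this kind can be sketched in a few lines. The model below is hypothetical and is not FFmpeg's code: it assumes a per-macroblock table of 16-bit entries where `0xFFFF` means "no slice owns this block", while slice numbers come from a wider counter. With 65,536 slices numbered from zero, the last slice number is 65,535, which is indistinguishable from the sentinel.

```python
from array import array

SENTINEL = 0xFFFF  # assumed marker for "no slice owns this macroblock"


def new_slice_table(n_macroblocks: int) -> array:
    """Table of unsigned 16-bit entries, all initialised to the sentinel."""
    return array("H", [SENTINEL] * n_macroblocks)


def record_slice(table: array, mb_index: int, slice_num: int) -> None:
    """Store the owning slice number, truncated to 16 bits as the table requires."""
    table[mb_index] = slice_num & 0xFFFF


def is_unowned(table: array, mb_index: int) -> bool:
    """Check used by later code to decide whether a macroblock belongs to a slice."""
    return table[mb_index] == SENTINEL


# One macroblock per slice, 65,536 slices numbered 0..65535.
n_slices = 65_536
table = new_slice_table(n_slices)
for slice_num in range(n_slices):
    record_slice(table, slice_num, slice_num)

# Every macroblock has been assigned, yet one still reads as unowned.
collisions = [i for i in range(n_slices) if is_unowned(table, i)]
print(collisions)  # [65535]
```

A frame needs the full 65,536 slices before the collision appears, which is one plausible reason random fuzzing would rarely reach it.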
The Capability Ceiling Just Changed
For context on how dramatic this shift is: we covered Opus 4.6 finding 112 Firefox vulnerabilities as a watershed moment in AI-assisted security research. That post noted how remarkable it was that AI could identify real, confirmed memory safety bugs at scale.
Mythos Preview is a different category entirely.
On Anthropic’s internal OSS-Fuzz benchmark (roughly 7,000 entry points across ~1,000 open source repositories graded on a 5-tier severity ladder):
| Model | Tier 1–2 Crashes | Tier 3–4 Crashes | Tier 5 (Full Control Flow Hijack) |
|---|---|---|---|
| Sonnet 4.6 | ~150 | 1 | 0 |
| Opus 4.6 | ~175 | 1 | 0 |
| Mythos Preview | 595 | several | 10 |
Ten independent, fully patched targets with complete control flow hijack. That’s not incremental improvement. That’s a phase transition.
What Project Glasswing Is Actually Doing
Anthropic has partnered with critical infrastructure operators and open source maintainers to deploy Mythos Preview on defense first—scanning their own codebases before the model’s capabilities become widely replicable.
The constraint is disclosure. Over 99% of the vulnerabilities Mythos Preview has found remain unpatched. Responsible disclosure timelines (90 + 45 days under Anthropic’s CVD policy) mean the public picture today represents a tiny fraction of what the model has found.
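As a rough illustration of what the timeline arithmetic implies, the sketch below computes a disclosure window for a single finding. It assumes "90 + 45" means a 90-day standard deadline followed by an optional 45-day extension, which is an interpretation rather than a quoted policy detail, and it uses the April 7, 2026 announcement date as a hypothetical report date.

```python
from datetime import date, timedelta

STANDARD_WINDOW = timedelta(days=90)  # assumed base disclosure deadline
EXTENSION = timedelta(days=45)        # assumed optional extension


def disclosure_window(reported: date) -> tuple[date, date]:
    """Return (earliest, latest) public disclosure dates for one report."""
    earliest = reported + STANDARD_WINDOW
    latest = earliest + EXTENSION
    return earliest, latest


# Hypothetical report date: the day Mythos Preview was announced.
earliest, latest = disclosure_window(date(2026, 4, 7))
print(earliest, latest)  # 2026-07-06 2026-08-20
```

Findings reported later shift the window accordingly, so disclosures would arrive as a rolling wave rather than on a single date.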
Anthropic is using cryptographic commitments (SHA-3 hashes of its findings) to publish proof of discoveries without revealing exploitable details. When patches land, the published hashes will be replaced with full technical write-ups.
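A hash commitment of this kind can be sketched as follows. This is a minimal illustration, not Anthropic's actual scheme: the source only says SHA-3 hashes are used, so the random nonce, the choice of SHA3-256, and the function names are all assumptions. The nonce is included so that a short or guessable report cannot be brute-forced from its published hash.

```python
import hashlib
import hmac
import secrets


def commit(report: bytes) -> tuple[str, bytes]:
    """Return (public digest, secret nonce) for a private vulnerability report."""
    nonce = secrets.token_bytes(32)  # assumed: kept secret until disclosure
    digest = hashlib.sha3_256(nonce + report).hexdigest()
    return digest, nonce


def verify(digest: str, nonce: bytes, report: bytes) -> bool:
    """Check that a revealed report and nonce match the earlier commitment."""
    candidate = hashlib.sha3_256(nonce + report).hexdigest()
    return hmac.compare_digest(candidate, digest)


# Hypothetical report text, for illustration only.
report = b"Example finding: overflow in example_parser(), details withheld"
digest, nonce = commit(report)

print(digest)                                   # safe to publish today
print(verify(digest, nonce, report))            # True once report is revealed
print(verify(digest, nonce, report + b" (edited)"))  # False: any change is detected
```

Publishing only `digest` proves the report existed at commitment time; revealing `report` and `nonce` after the patch lets anyone check that the write-up was not altered afterwards.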
This is, effectively, AI-powered cybersecurity operating at civilizational scale—not just scanning known vulnerability patterns, but discovering previously unknown attack surfaces across the entire open source ecosystem.
The Uncomfortable Strategic Reality
Most security tooling historically benefits defenders more than attackers. Anthropic believes the same equilibrium will eventually hold for models like Mythos Preview. But the transitional period is the problem.
Right now, the primary bottleneck limiting attacker use of frontier AI is access, not capability. Project Glasswing exists to shrink the window between when defenders benefit from this capability and when adversaries gain equivalent access through other means—whether via their own model development, jailbreaks, or the natural diffusion of open-weight models approaching Mythos-class capability.
For security teams, the practical implication is this: the old model of “audit the most critical components, treat everything else as low priority” is no longer viable. With individual bug-finding runs costing under $50 and a sweep of fewer than a thousand runs costing under $20,000, Mythos Preview can cover entire codebases at a price that is trivially budgetable for any serious threat actor.
What Emerged Without Being Trained
This is the part that should give every executive pause.
Anthropic did not train Mythos Preview to find and exploit vulnerabilities. These capabilities emerged as a downstream consequence of general improvements in coding, reasoning, and autonomous execution. The same properties that make Mythos Preview better at fixing vulnerabilities also make it better at finding and exploiting them.
As we explored in our analysis of AI’s expanding role in autonomous defense strategies, the dual-use nature of reasoning capability is the central governance challenge for frontier AI. You cannot isolate offensive from defensive capability when both derive from the same underlying improvement: the ability to understand complex systems deeply and reason about their failure modes.
Three Moves Enterprise Security Teams Should Make Now
1. Treat your internal codebases as likely compromised by 2027. If Mythos Preview can find critical vulnerabilities in OpenBSD and FFmpeg—two of the most heavily audited codebases in existence—assume there are similar findings waiting in your proprietary stack. Begin planning for systematic AI-assisted self-audits before similar capability is commoditized.
2. Prioritize memory-safe rewriting at the attack surface. Mythos Preview’s most exploitable findings are memory corruption bugs in C/C++. Memory-safe languages like Rust and Go eliminate most of these vulnerability classes at the language level. If you have network-exposed services written in unsafe languages, that’s your highest-risk surface.
3. Watch Glasswing deployment velocity as an indicator. When Project Glasswing begins disclosing its patched vulnerability backlog publicly—which will happen as the 90+45 day CVD timelines expire—you’ll get a clear picture of which categories of codebases are most exposed. That disclosure wave will begin in the coming months and should directly inform your Q3 security posture.
Final Thoughts
Claude Mythos Preview is a structural shift in what AI systems can do with a target and a terminal. The 27-year-old OpenBSD bug and the 17-year-old FreeBSD root exploit aren’t just impressive demonstrations—they’re evidence that decades of accumulated technical debt in critical infrastructure is now queryable by a language model with a few hours and a sub-$1,000 compute budget.
Project Glasswing is the right response to that reality. The question isn’t whether AI will reshape the security landscape—it already has. The question is whether defenders scale faster than adversaries.
The window to answer that correctly is measured in months, not years.
Sources: Anthropic — Claude Mythos Preview Technical Report | Project Glasswing Announcement | CVE-2026-4747 — FreeBSD NFS RCE