on CyberGym benchmarks for real-world vulnerability discovery
Current capabilities
Model capabilities have outpaced cyberdefense
Model capability has moved faster in the past year than most defenders have planned for. Three recent benchmarks show what changed.
Model capability is doubling every 0.7 months
Across 41 real-world vulnerabilities, Mythos Preview took control of the system in 21 — no other model got past two, including GPT 5.5 and Opus 4.7. As capability like this becomes widely available, writing working attacks stops requiring expert skill, and the cost of an unpatched vulnerability rises with it.
The first model that gets past modern security walls
A year ago, the most capable models could spot flaws but couldn't easily turn them into working attacks. Today they can. Mythos Preview is the first to reliably get past the safety walls modern software is built with: its success rate stays high through the hardest tiers while every other model's collapses.
Putting frontier models to work for defense
As part of Project Glasswing, Mozilla brought Mythos Preview into their Firefox security review. The April release shipped 271 fixes for latent bugs found with the model — more than 20× the team's monthly average. Some had survived decades of human review.
Project Glasswing preview
A principled approach to Claude Mythos access
Claude Mythos is a research preview model with significantly stronger cybersecurity capabilities, especially in exploit reasoning. This capability carries the greatest potential for misuse in security, and we’re rolling out access carefully as we work toward general access.
Securing critical software
Preview partners maintain critical infrastructure or software the world depends on, where a successful attack would be catastrophic.
Building towards general access
Anthropic is developing the safeguards required to release this capability broadly. The preview is how we learn to do that responsibly.
Providing tools for defenders today
Claude Security, the open-source reference tools, and the practices emerging from the preview are available to all security teams today.
Security for evolving needs
One security-tuned model, two ways to use it: build your own with the Claude Developer Platform, or deploy Claude Security on your code.
Find and fix vulnerabilities with Claude Security
Claude Security reasons about your code like a security researcher: scanning for vulnerabilities, validating findings, and proposing targeted patches.
Deploy security agents with the Claude Developer Platform
Ship defender tools and custom security agents with sandboxed execution, credential isolation, and audit logging built in via the Agent SDK, MCP, and Claude API.
Leverage frontier models for defense
Opus leads CyberGym for vulnerability discovery, with real-time safeguards on by default. Cybersecurity practitioners can apply to use it for verified defense.
In the workflow
Security for evolving needs
Superior reasoning and human-quality responses.
Cyber defense powered by Claude Opus, available through our partners
Claude Security
How security teams use Claude
Vulnerability detection and remediation
Get an overview of vulnerabilities and the corresponding suggested fixes in a single workflow. Claude traces data flows across your entire codebase, determines whether a finding is actually vulnerable, drafts a patch that follows your codebase's patterns, and opens a pull request for your team to review.
Claude Developer Platform
Building defender agents and products with Claude
Build security products
Integrate Claude's reasoning into your security platform or product through the API and Agent SDK.
- Connect Claude to your scanning, alerting, and remediation workflows through MCP
- Spawn specialized subagents for parallel tasks like triage, severity scoring, and patch generation
- Deploy in sandboxed containers with network controls, credential isolation, and audit logging built into the SDK
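As a rough illustration of the subagent pattern in the list above, here is a minimal Python sketch that fans one finding out to three parallel tasks (triage, severity scoring, patch generation). It uses only the standard library and is not the Agent SDK's actual interface: the `Finding` type, the prompt templates, `run_subagents`, and the `call_model` callable are all hypothetical names chosen for this example. In a real integration, `call_model` would wrap whatever client you use to reach Claude.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, Dict

# Hypothetical: any function that sends a prompt to a model and returns text.
ModelCall = Callable[[str], str]


@dataclass(frozen=True)
class Finding:
    """A single scanner finding (hypothetical shape for this sketch)."""
    id: str
    description: str


# Hypothetical prompt templates, one per specialized subagent task.
SUBAGENT_PROMPTS: Dict[str, str] = {
    "triage": "Is this finding likely a true positive? {description}",
    "severity": "Rate the severity of this finding: {description}",
    "patch": "Propose a minimal patch for this finding: {description}",
}


def run_subagents(finding: Finding, call_model: ModelCall) -> Dict[str, str]:
    """Run every subagent prompt in parallel and collect results by task name."""
    with ThreadPoolExecutor(max_workers=len(SUBAGENT_PROMPTS)) as pool:
        futures = {
            task: pool.submit(
                call_model, template.format(description=finding.description)
            )
            for task, template in SUBAGENT_PROMPTS.items()
        }
        # result() blocks until each task finishes and re-raises any exception.
        return {task: future.result() for task, future in futures.items()}


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; returns the prompt's first word."""
    return "stub:" + prompt.split()[0]
```

Passing the model call in as a plain callable keeps the fan-out logic testable without network access; swapping `fake_model` for a real client wrapper should not require changes to `run_subagents`.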
Project Glasswing preview
Insights from our most capable model
Claude Mythos is a research preview model tuned for advanced vulnerability discovery, exploit reasoning, and autonomous security investigation. Mythos extends what Opus can do on the hardest classes of security work.


