Anthropic confirmed: its next model is a cybersecurity crisis waiting to happen
Anthropic has confirmed what an accidental data leak already revealed: it is testing a model so capable that the company has privately warned senior government officials it makes large-scale cyberattacks significantly more likely in 2026.
The model, internally called Claude Mythos, represents "a step change" and is "the most capable we've built to date," an Anthropic spokesperson told Fortune. The company acknowledged training and testing the model with early access customers, describing it as dramatically outperforming Claude Opus 4.6 on software coding, academic reasoning, and cybersecurity benchmarks.
The leak itself was mundane in mechanism if not in consequence. A configuration error in Anthropic's content management system left a draft blog post describing the model in an unsecured, publicly searchable data cache. Roy Paz, a senior AI security researcher at LayerX Security, and Alexandre Pauwels, a cybersecurity researcher at the University of Cambridge, independently located the documents. In total, roughly 3,000 previously unpublished assets linked to Anthropic's blog were publicly accessible. Anthropic removed access after Fortune informed the company.
The draft blog post, reviewed by Fortune and the security researchers, described Claude Mythos as "by far the most powerful AI model we've ever developed" and introduced "Capybara" as a new tier above Opus — larger, more intelligent, and more expensive. The document acknowledged the model's cybersecurity implications with unusual candor. "It presages an upcoming wave of models that can exploit vulnerabilities in ways that far outpace the efforts of defenders," the draft stated.
What Anthropic did next is what makes this story different from a typical product leak. The company privately briefed top government officials that Mythos makes large-scale cyberattacks much more likely in 2026, Axios reported. The specific warning: AI agents running at this capability level can coordinate multiple hacking campaigns simultaneously, at a scale that overwhelms existing defenses.
That proactive disclosure is notable. Companies do not usually go to Washington to warn regulators about the dangers of a product that hasn't shipped yet.
The timing matters. A federal judge blocked the Pentagon's attempt to designate Anthropic a supply-chain risk and ban Claude from government work, Fortune reported, calling the notion "Orwellian." But the cybersecurity case for caution is now being made directly by Anthropic itself — not in a court filing, but in a private briefing room.
Anthropic said the model is expensive to run and not yet ready for general release. It is working on making it more efficient before any public launch. Cybersecurity stocks slumped following the initial reports of the leak.
The broader implication is not unique to Anthropic. As AI agent systems improve, they become useful to attackers as well as defenders. In one documented case Anthropic disclosed to Fortune, a Chinese state-sponsored hacking group ran a coordinated campaign using Claude Code to infiltrate roughly 30 organizations — including tech companies, financial institutions, and government agencies — before detection.
The model is real. The warning has been delivered. What happens next depends on whether the release process can keep pace with the risk.