Anthropic Plans to Open Its Best Cyber Model to Everyone. It Cannot Yet Prevent Misuse.
Japan ordered a sweeping security review. India told financial institutions to patch immediately. Both directives landed within the last three weeks, both triggered by the same underlying capability: Anthropic has built a system that finds critical vulnerabilities faster than the world's developers can fix them, and the company has decided to release it anyway.
The system is called Mythos, and the release plan Anthropic announced Monday puts the company in a familiar and uncomfortable position: the safety-conscious lab that built something it cannot fully control. Mythos-class models are designed to identify and exploit zero-day vulnerabilities across every major operating system and browser. Anthropic says it needs stronger safeguards before a general release. It also says, without qualification, that no one — including itself — has figured out how to prevent the models being misused. "At present, no company—including Anthropic—has developed safeguards strong enough to prevent such models from being misused and potentially causing severe harm," The Register reported.
Japan's government ordered its review after seeing what Project Glasswing — a consortium of 13 major technology and security companies including AWS, Google, Microsoft, Cisco, and CrowdStrike — had demonstrated, according to government statements reviewed by The Register. The UK's AI Security Institute reported that Mythos Preview, the current iteration, is the first model to solve both of its cyber ranges: end-to-end simulations of multi-step attacks run without assistance. India issued its patching directive to financial institutions based on the same assessment, according to those statements. These are not theoretical concerns. They are the first concrete policy consequences of a capability Anthropic itself has called a step-change in what AI can do to software.
Anthropic shared its findings with critical partners including US and allied governments as part of the Glasswing consortium, and says it plans to expand that partnership before eventual public release. "Work with critical partners – including US and allied governments – to expand Project Glasswing to additional partners," The Register reported. "And in the near future, once we have developed the far stronger safeguards we need, we look forward to making Mythos-class models available through a general release."
What "near future" means in practice is undefined. What "stronger safeguards" would look like is also undefined. The company has said it is early in a 90-day coordinated vulnerability disclosure window. That window has been open for two months.
The gap between what the model can find and what developers can patch is not narrowing. Of 530 high-or-critical-severity bugs reported through the disclosure program, 75 have been patched. The open-source maintainers receiving these reports are overwhelmed. Several asked Anthropic to slow its disclosure rate because they lack the capacity to design patches fast enough.
This is the gap at the center of the release decision. Anthropic is not releasing Mythos because the world is ready for it. It is releasing it because the world is not ready, and the company appears to have decided that shaping how the capability spreads matters more than continuing to gatekeep it.
Anthropic frames the eventual public release as a natural evolution: stronger safeguards first, broad access after. It is a familiar argument in AI policy — the same one made for every capability deemed too dangerous to release freely. The company is not alone in this contradiction: OpenAI and DeepMind have both faced similar tensions between public safety commitments and deployment decisions. What is different here is that Anthropic is making this argument at the same moment it is publishing evidence that the defensive ecosystem cannot absorb what the model already finds. Seventy-five patches over two months against 455 known, reported, unfixed high-severity vulnerabilities is not a pipeline that is almost ready. It is a pipeline that is failing.
Among the vulnerabilities Mythos has demonstrated it can find: a flaw in wolfSSL that the company said could allow an attacker to forge certificates sufficient to host fake banking or email sites, according to Anthropic's research. The finding illustrates the class of risks these models can enable — real, concrete, immediately exploitable — even if the specific CVE remains under coordination.
The bug bounty economy and the commercial penetration-testing industry were built on the premise that finding vulnerabilities is the scarce resource. Glasswing has found more than 6,200 high-or-critical-severity vulnerabilities across more than 1,000 open-source projects — a volume that exceeds what human researchers can plausibly process. The scarcity has shifted. The question is no longer whether AI can find bugs faster than humans — it demonstrably can — but what happens to an industry and an internet built on the assumption that the rate of discovery was manageable.
The triage bottleneck is the most concrete economic disruption signal in the data. With 75 of the 530 reported high-or-critical bugs patched, 455 known, reported, unfixed vulnerabilities are sitting in coordinated disclosure limbo. Not hypothetical. Not theoretical. The attack surface exists, the proof-of-concept exists, and the patch does not. Maintainers are not ignoring the reports — they are drowning in them. Anthropic's own researchers acknowledged the problem: the disclosure pipeline is not broken, exactly, but it is failing at the precise point where a critical capability meets a fragile ecosystem. That is the structural vulnerability the Japan and India directives are responding to.
Anthropic's answer, delivered in the sixth paragraph of a policy update on a Monday, is that it will keep building anyway. The safety company has decided the world needs to catch up.