Anthropic's Best Cyber Model Has Found 10,000 Bugs. The Hard Part Is Patching Them.

Anthropic's Best Cyber Model Has Found 10,000 Bugs. The Hard Part Is Patching Them. — type0 | type0

PREVIEWAnthropic's Best Cyber Model Has Found 10,000 Bugs. The Hard Part Is Patching Them. · MD

Project Glasswing's roughly 50 partners have used Claude Mythos Preview to surface more than ten thousand high- or critical-severity vulnerabilities across systemically important software. Progress is now limited by how fast those flaws can be verified, disclosed, and patched — not by how fast AI can find them. In a May 22 update, Anthropic acknowledged that it cannot yet prevent misuse of the underlying capabilities and is weighing release of Mythos-class models to broader users.

The bottleneck has moved. That is the story.

Anthropic framed Project Glasswing, announced in April 2026, as a collaborative effort with roughly 50 partners to secure critical software before AI models can be weaponized against it. The first weeks produced a striking number: more than ten thousand high- or critical-severity findings. But the headline metric masks the actual constraint. Discovery is no longer the limiting factor. Verification, coordinated disclosure, and patch deployment are.

The Register reported on May 25, 2026 that Anthropic plans to release Mythos-class models to the public — a paraphrase of Anthropic's more hedged "thinking about releasing" language in its May 22 post. Earlier, the same outlet had covered the dual-use stakes when Mythos was shown finding and exploiting 0-days. Anthropic's own Frontier Red Team Mythos Preview post is the in-house source for those capability and misuse claims.

Independent corroboration exists, with caveats. Vidoc Security Lab said it reproduced Anthropic's Mythos findings with public models — a partial reproduction that should be cited as their specific finding rather than generalized. Amarda Shehu's commentary on Claude Mythos Preview frames the moment from an academic-adjacent perspective: the tools are out, the question is who steers them.

What changes now is where the leverage sits. Coordinated-disclosure norms — typically 90 days, sometimes 45 post-patch — mean public numbers trail capability by months. The disclosed vulnerabilities in Anthropic's update are illustrative, not the full partner results; many cases are still mid-disclosure, and the public figure is a lagging indicator of what the model can find. Exact public-release timeline, gating safeguards, and partner composition beyond "roughly 50" remain open questions rather than settled facts.

Defenders, open-source maintainers, and policymakers now hold a problem that did not exist in the same shape a year ago: not whether AI can find the bug, but whether the disclosure and patch pipeline can keep up with what it finds. The "ten thousand" figure is a setup, not a climax. The window between finding and fix is the new attack surface — and the new defensive one. That is a buildable frontier.

Anthropic's Best Cyber Model Has Found 10,000 Bugs. The Hard Part Is Patching Them.

Sources