Anthropic Found 10,000 Vulnerabilities. The Hard Part Is Fixing Them.

Anthropic Found 10,000 Vulnerabilities. The Hard Part Is Fixing Them. — type0 | type0

PREVIEWAnthropic Found 10,000 Vulnerabilities. The Hard Part Is Fixing Them. · MD

Anthropic's AI security tool found more than 10,000 high- and critical-severity vulnerabilities in a single month. The hardest part isn't discovering them anymore. It's patching them fast enough to matter.

That is the real bottleneck Project Glasswing exposed. The coordinated disclosure program Anthropic launched with roughly 50 partner organizations published its first results on May 22, showing its Mythos Preview model found what it estimates are 6,202 high- or critical-severity flaws across 1,000 open-source projects, on top of partner discoveries. Anthropic Glasswing update

The scale is visible in partner results. Mozilla ran Mythos Preview against a pre-release version of Firefox 150 and found and fixed 271 vulnerabilities, over ten times more than it found in Firefox 148 with Claude Opus 4.6. Anthropic Glasswing update Cloudflare found 2,000 bugs across its systems, 400 of them high- or critical-severity. Palo Alto Networks' most recent release included over five times as many patches as usual, Palo Alto Networks reported. Microsoft expects its patch releases to keep growing, a pattern it attributed directly to AI-accelerated vulnerability discovery in its May advisory. Microsoft MSRC

Independent researchers validated the process. Six security research firms independently assessed a sample of findings and confirmed a 90.6 percent true positive rate. Anthropic Glasswing update The UK's AI Security Institute, a government body that runs standardized cyberattack simulations, reported that Mythos Preview is the first model to solve both of its ranges end to end — a benchmark designed to test multi-step attack chains, not just individual bug discovery. UK AISI

But the caveats live in the details. An independent analysis by Flyingpenguin found that removing Mythos's two most-exploitable findings from the test corpus drops its full chain exploitation rate from 72.4 percent to 4.4 percent. Flyingpenguin Sonnet 4.6, Anthropic's previous model, could identify the same top bugs as exploitation candidates but could not close the chain into actual exploitation primitives. Anthropic A public demonstration involving Apple M5 silicon required a human expert to design the exploit chain. Mythos identified the bugs quickly because they belonged to known classes; human expertise assembled them into a working attack. Engadget

The concentration finding is the part that matters for production use. If a small number of high-severity findings drive most of the output, the workflow around Mythos — triage, prioritization, deduplication — matters as much as the model itself. Cloudflare described this explicitly: their vulnerability process has eight stages, including a dedicated check for whether an attacker can actually reach a bug from outside the system. Cloudflare

What the skeptic view points to is a distinction that gets lost in the headline: the 10,000-vuln figure is real, but the vulnerabilities cluster. Remove the top two bugs from a given target and the success rate collapses. Finding everything faster means finding the easy wins faster too, and those easy wins dominate the count.

The more durable problem is what comes after finding. Cloudflare's 2,000 bugs and Mozilla's 271 Firefox fixes did not get patched by the AI. Human engineers reviewed, validated, and shipped patches. Oracle told Anthropic it is finding and fixing vulnerabilities multiple times faster than before, but "multiple times faster" starting from a backlog that grows faster than any team's patch cadence is a rate argument, not a solved problem.

Microsoft said it directly: patch releases will continue trending larger for some time, because AI-discovered bugs are entering the queue faster than existing release processes can absorb them. Microsoft MSRC

The industry has found the floor of the vulnerability problem faster than it has found the ceiling of its own remediation capacity. What Glasswing shows is not that AI makes security worse — it makes visibility dramatically better. The question is what organizations do with that visibility when the queue of confirmed, exploitable, high-severity vulnerabilities grows faster than their teams can close it.

At one Glasswing partner bank, Mythos Preview helped detect and prevent a fraudulent $1.5 million wire transfer that existing systems had missed. Anthropic Glasswing update That is a concrete win. It coexists with the arithmetic of 10,000 vulnerabilities spread across organizations whose patch cycles still run in weeks.

What to watch: whether the organizations Glasswing enrolled respond by building automated remediation pipelines — systems that can patch at closer to the speed AI now finds bugs — or whether the visibility advantage creates a new kind of security theater, where companies know their exposure faster than they can act on it.

Anthropic Found 10,000 Vulnerabilities. The Hard Part Is Fixing Them.

Sources