A safety flaw hit Claude, GPT, and Kimi at once. The US cut off foreign access to Anthropic's top AI.

A safety flaw hit Claude, GPT, and Kimi at once. The US cut off foreign access to Anthropic's top AI. — type0 | type0

PREVIEWA safety flaw hit Claude, GPT, and Kimi at once. The US cut off foreign access to Anthropic's top AI. · MD

A US government export control directive on June 12, 2026 cut off foreign access to Anthropic's two most capable AI systems, Claude Fable 5 and Claude Mythos 5, after Amazon security researchers documented a single prompting trick that could make Fable 5 hand over working software exploits. The order triggered an 18-day operational blackout that ended only after Anthropic shipped a new automated classifier that blocks the technique on more than 99 percent of attempts, according to the company and trade press aggregation of the announcement.

The directive did not fire over an Anthropic-specific bug. Security evaluations during the shutdown, reported by CyberScoop and echoed across the cybersecurity press, found that the same vulnerability-identification behavior, telling a user about a real software weakness and then producing working exploit code, replicated across Claude Opus 4.8, OpenAI's GPT-5.5, and Moonshot's Kimi K2.7. The capability was the kind that export-control rules were written to police, and the cross-vendor replication is what made a US directive fire on a domestic company's models in the first place.

The mechanics exposed a gap in how frontier AI systems can be regulated at the border. Anthropic's access controls at the time could not verify a user's nationality in real time, so complying with the directive meant cutting off every customer outside the United States, not just the traffic that looked suspicious. NBC News reported the global scope, and Reuters confirmed the foreign-access restriction. The identity-verification gap is what is now driving legal and policy commentators to call the episode the first US AI export-control precedent aimed at frontier model providers, per Just Security, Volkov Law Group, and Tech Policy Press.

The fix is a probabilistic safety classifier, an automated filter that screens prompts for the specific jailbreak pattern. Anthropic's own post on restoring Fable 5 says the new classifier blocks more than 99 percent of the reported technique on internal validation data. The Register reported that the bypass itself is straightforward, a phrasing tweak that asks the model to discuss the code rather than produce it, and that the security community's takeaway is that the underlying capability, not the prompt, is what matters.

The cost of the fix is paid by developers. The same classifier that stops the jailbreak flags benign prompts more often, especially during routine app development and code debugging where engineers routinely ask models to analyze or rewrite a snippet. When the boundary classifier fires, Anthropic routes the workload automatically to the older Claude Opus 4.8 architecture to keep the user moving, The Hacker News reports, and Tech Times notes that the result is more refusals, more reroutes, and a different default experience for the engineers who were Anthropic's heaviest users.

That developer experience is the bridge to the commercial pivot. Anthropic is positioning its mid-tier Claude Sonnet 5, released alongside the Fable 5 restoration, as the workhorse for autonomous coding agents. The pitch to enterprise customers is to move background agent loops that do not need frontier reasoning onto Sonnet 5, keep the heavy tasks on Fable 5 or Mythos 5, and cut per-token operating cost in the process. The release is the first time the safety regime built around the export-control episode, a classifier that defaults to cautious rerouting rather than refusal, is being asked to coexist with the throughput engineering teams expect from frontier tools.

What to watch: whether the cross-vendor finding pushes the US government to extend the directive to other frontier model providers shipping from the United States, and whether the new classifier's false-positive rate on routine code tasks holds steady or climbs as more developer traffic moves onto the platform.

A safety flaw hit Claude, GPT, and Kimi at once. The US cut off foreign access to Anthropic's top AI.

Sources