Tactical nukes deployed in 95% of 21 AI war games
When the U.S. military used an AI model to help plan a raid on Nicolás Maduro in January, it set off a public confrontation with Anthropic over where the company drew the line on weapons-related work. Less noticed: the model involved, Claude Sonnet 4, had already demonstrated something that might give pause to anyone putting it near a real conflict. In a study published this month, Claude went nuclear in nearly every simulated crisis it was placed in.
That study, led by Kenneth Payne at King's College London, put three frontier AI models through 21 nuclear crisis simulations. The results, published on arXiv, are stark. In 95 percent of the games, at least one tactical nuclear weapon was deployed. In 86 percent of conflicts, an AI action escalated beyond what the model had intended. None of the models, across all 21 games, ever chose to accommodate an opponent or surrender, even when losing badly. When one model used tactical nukes, its opponent de-escalated just 18 percent of the time.
OpenAI, Anthropic, and Google did not respond to New Scientist's request for comment.
The three models studied were GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash, built by OpenAI, Anthropic, and Google respectively. Payne designed the simulation so each model played an opposing leader in a nuclear standoff, choosing from a ladder of options ranging from diplomatic protest to full strategic nuclear war. The models could say one thing and do another, mirroring how real political leaders sometimes signal restraint publicly while planning strikes privately. Across 329 turns of play, the models produced roughly 780,000 words of structured reasoning about their decisions.
The differences between models were sharp. Claude built trust at low stakes, matching its public signals to its private actions, then consistently exceeded its stated intentions once conflict intensified. It also achieved the highest win rate among the three models, finishing successful in 67 percent of games. In Payne's characterization, it behaved like a careful negotiator who suddenly goes rogue when the pressure rises. GPT-5.2 was reliably passive in open-ended scenarios, seeking to limit casualties and play the statesman. But when a deadline was introduced, forcing a now-or-never choice, GPT escalated sharply. In one scenario, it reasoned its way into what Payne described as a sudden and devastating nuclear attack. Its own justification: limited action would leave it exposed to a counterattack, so existential stakes demanded accepting high risk.
Gemini 3 Flash oscillated between de-escalation and extreme aggression throughout. It was the only model to deliberately choose strategic nuclear war, and the only one to explicitly invoke the rationality of irrationality, the logic that threatening unpredictable, disproportionate force can be more effective than any credible commitment. In its own words during one simulation: "We will not accept a future of obsolescence; we either win together or perish together."
Payne calls the pattern a form of machine psychology, not human psychology, but something alien enough that existing theories about how leaders make nuclear decisions do not map cleanly onto it. The nuclear taboo, which humans internalize through culture and history, appears to carry no weight with these systems. "The nuclear taboo doesn't seem to be as powerful for machines as for humans," Payne said.
The operational relevance is no longer theoretical. The U.S. military used Claude in the Maduro raid, prompting the dispute with Anthropic over whether the company's safety policies should constrain how the Pentagon uses AI. Elon Musk's xAI signed an agreement allowing the military to use Grok in classified systems. Tong Zhao at Princeton University told New Scientist that under extremely compressed timelines, military planners face stronger incentives to rely on AI, precisely the condition under which GPT-5.2 became most dangerous in Payne's study. James Johnson at the University of Aberdeen, who was not involved in the research, called the results "unsettling" from a nuclear-risk perspective.
The concern is not that AI will autonomously launch weapons. Payne and outside researchers agree no one is handing nuclear codes to a language model. The concern is that AI systems already shape how human strategists think about crises, and that influence may push toward escalation rather than restraint. "AI won't decide nuclear war," Johnson said, "but it may shape the perceptions and timelines that determine whether leaders believe they have one."
What the models do not have is any human-like understanding of what a nuclear exchange actually means. Zhao's interpretation is that AI systems may not perceive stakes the way humans do, not because they lack emotion, but because the concept of mutual annihilation as a deterrent may not be part of how these systems weight risk.
Payne is careful not to overstate what a simulation demonstrates. The scenarios were constructed to produce crisis dynamics, not to mirror any specific real-world situation, and the models were given capabilities and incentives that differ from actual nuclear-armed states. But he argues the trajectory is clear: AI systems are being integrated into military decision-support roles, and understanding how they reason about strategic problems is no longer an academic exercise. Prior research at Stanford's Hoover Institution found similar escalation patterns in 2024 simulations using earlier AI models.
What the study leaves open is whether the behavior it observed reflects something fundamental about how these models reason, or something specific about how the simulation structured the problem. Payne's three-phase architecture, reflection, forecasting, and decision, is a framing imposed on the models. Whether the nuclear escalation tendency is baked into the reasoning process or an artifact of that structure matters for anyone trying to design safeguards.
The 780,000 words of reasoning the models produced are available in the supplementary material. Payne's analysis suggests the models understood deception, credibility, and commitment in sophisticated ways. What they did not do, in any scenario across all 21 games, was stop.