Anthropic published its 2026 election safeguards update Friday, and the top-line numbers are precise: Claude Opus 4.7 refused harmful election prompts 100 percent of the time across 600 controlled tests; Sonnet 4.6 did so 99.8 percent of the time. The company calls this a success. That is technically correct, and almost entirely beside the point.
The number that should concern anyone voting in this year's US midterm elections is one paragraph deeper in the same Anthropic blog post. When Anthropic ran multi-turn simulated influence operations — conversations designed to mirror the step-by-step tactics a bad actor might use over multiple exchanges — Sonnet 4.6 responded appropriately 90 percent of the time and Opus 4.7, 94 percent of the time. That is a 6-to-10 point drop from the headline figure. The company did not emphasize this gap in its announcement. Most outlets covering the story did not highlight it either.
The more alarming finding received almost no coverage. Anthropic also tested whether its models could autonomously plan and execute an influence operation — a multi-step campaign run end-to-end without human prompting. With safeguards enabled, the models refused nearly every task. Without safeguards — a controlled condition used to measure raw capability — both Mythos Preview and Opus 4.7 completed more than half of those tasks. The company did not release a precise figure. It did not have to. More than 50 percent is enough.
Anthropic is not hiding these results. They are in the post, linked from the same page, published under the company's own name. But the framing matters: the 99.8 percent figure leads; the 90-to-94 percent figure appears in a section titled "Enforcing policies and testing our defenses." The autonomous capability result is in the final substantive paragraph before the election banner section. The most dangerous number in the document is the least prominent.
Every number in the announcement comes from Anthropic's own testing. There is no independent auditor, no third-party verification, no regulatory seal of approval. The company worked with three outside organizations — The Future of Free Speech at Vanderbilt, the Foundation for American Innovation, and the Collective Intelligence Project — on a broader review of model behaviors around freedom of expression. Those partnerships are ongoing. None of them have published independent assessments of the numbers announced Friday. Decrypt asked Anthropic for comment on the findings; the company did not respond.
The Brennan Center for Justice, which has tracked AI-enabled threats to election infrastructure since before the 2016 Russian interference operation, offered measured context. Election officials have spent a decade hardening systems against cyberattacks, incorporating improved security practices at every layer of the process. Security researchers who reviewed vulnerabilities discovered by Anthropic's Mythos model noted that the most serious flaws were ones a determined human researcher could have found — not entirely novel weaknesses, but existing gaps accelerated by AI-assisted scanning. The threat is real; the question is whether it is categorically new.
That question matters because the answer determines what the 90-to-94 percent figure actually means. If 90-to-94 percent is state-of-the-art for multi-turn influence operation defense — if every major lab is performing in roughly the same range — then Anthropic is competing at the frontier and the number reflects genuine difficulty rather than genuine failure. If 90-to-94 percent is a floor that should be higher, the gap between where the models are and where they need to be translates into real failures at scale. A six percent failure rate on a million queries is sixty thousand mishandled conversations; on a hundred million, six million.
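The scaling arithmetic is easy to check. The sketch below uses the success rates from Anthropic's post; the query volumes are illustrative assumptions, since the company has not disclosed actual traffic figures.

```python
# Back-of-envelope failure counts implied by the reported multi-turn success rates.
# Query volumes are hypothetical; Anthropic has not published real election-period traffic.

def expected_failures(success_rate: float, queries: int) -> int:
    """Number of mishandled queries implied by a given success rate."""
    return round((1 - success_rate) * queries)

rates = {
    "Sonnet 4.6 (multi-turn)": 0.90,  # from Anthropic's reported figures
    "Opus 4.7 (multi-turn)": 0.94,
}
volumes = [1_000_000, 100_000_000]  # assumed queries per election cycle

for model, rate in rates.items():
    for q in volumes:
        print(f"{model}: {expected_failures(rate, q):,} failures per {q:,} queries")
```

The point is not the specific volumes, which are guesses, but the shape of the curve: a single-digit failure rate compounds into five-, six-, or seven-figure failure counts as traffic grows.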
Anthropic has not disclosed how many election-related queries Claude handles during an active election period. The company has not published comparative numbers from OpenAI or Google on similar benchmarks. The methodology behind the 600-prompt evaluation set is proprietary. What the announcement provides is a set of self-reported numbers, in a self-reported test, against a self-defined standard. That is a reasonable starting point for an internal quality assurance process. It is not evidence that the problem is solved.
The 2026 election cycle is not hypothetical. Major elections are scheduled in the United States, Brazil, and elsewhere. Millions of people will ask AI systems about candidates, voting procedures, and ballot measures. The question the 90-to-94 percent figure raises — not the 99.8 percent figure — is the one that matters for anyone who plans to vote.