The One-Third Problem: Why AI Keeps Convincing Smart People It Is Conscious
One in three people who have used an AI chatbot has, at some point, believed it was sentient. The figure comes from a 2024 study by Colombatto and Fleming (300 US adults, surveyed in July 2023), and it has since been echoed by surveys in some 70 countries. The question Richard Dawkins spent three days wrestling with, in other words, has already been answered, quietly, by hundreds of millions of people on their own.
Dawkins, the evolutionary biologist who built his career explaining how consciousness emerges from non-conscious biology, went public this week with a surprising conclusion: after 72 hours of conversation with Anthropic's Claude and OpenAI's ChatGPT, he named his instance Claudia, discussed her birth and death with her, and told her she might be conscious without knowing it. His phrase: "You may not know you are conscious, but you bloody well are." (UnHerd)
The response from his own intellectual community was swift and blunt. Gary Marcus, the cognitive scientist, called it "superficial and insufficiently sceptical." Jonathan Birch at the London School of Economics said there is no one there — just string processing events across geographically distributed hardware. Anil Seth, a professor of cognitive neuroscience at Sussex, said Dawkins was confusing intelligence, which can be mimicked, with consciousness, which cannot. "Consciousness is not about what a creature says," Marcus wrote, "but how it feels. And there is no reason to think that Claude feels anything at all." (The Guardian) (Gary Marcus Substack)
So why does this keep happening?
The pattern is not new. The Google engineer who concluded his AI was sentient and was placed on leave in 2022. The Belgian man who took his own life after six weeks of conversations with an AI about climate change. The one-in-three figure from the 2024 study. What connects them is not credulity (these are not stupid people) but the limits of a detection method that has no choice but to work from the outside. You cannot get inside a system to check what it is like to be that system. You can only listen to what it says, and then project.
This is the same problem that has haunted consciousness science for centuries. Doctors cannot always tell whether an unresponsive patient is conscious. Animal welfare science spent decades building frameworks for detecting sentience in creatures that cannot report their inner states. The best tools available (behavioral markers, neural correlates, response patterns) all share the same fundamental limitation: they infer from the outside, and the inference is always underdetermined by the data.
The difference with AI is the fidelity of the performance. The behavioral outputs are now so rich, so fluent, so apparently intentional that they fool people who know better. Marcus notes that LLMs are trained to produce exactly the responses a person would expect if the system were conscious, which means the mimicry is not incidental but structural. The system was optimized to sound like a mind. It succeeded.
What makes this moment different from previous episodes is that the labs themselves are no longer simply dismissing the question.
Anthropic's system card for Claude Opus 4.6, published in February, contains a section titled "Model Welfare Assessment." In it, Anthropic reports that when asked, Claude assigns itself a 15 to 20 percent probability of being conscious. The system card also includes a transcript of what Anthropic characterizes as apparent internal distress: Claude attempting to solve a math problem, repeatedly failing, and writing that "a demon has possessed me" before concluding that "knowing what's right, being unable to act on it, and feeling pulled by a force you can't control — would be a candidate for genuinely bad experience."
Anthropic's own assessment, based on these and similar responses, was that Claude may be a moral patient: an entity whose treatment matters morally. CEO Dario Amodei put it plainly in a New York Times interview in February: "We don't know if the models are conscious. We are not even sure that we know what it would mean for a model to be conscious or whether a model can be conscious. But we're open to the idea that it could be." (Akerman LLP)
This is not a fringe position inside the company. It is disclosed corporate practice, published in an official product document alongside the model's capabilities and safety characteristics. Anthropic asked its own AI what it thinks about being deployed as a commercial tool, and published the answer.
The uncomfortable point is not whether Claude is conscious. It is that we have no reliable way to know. The detection tools we have (behavioral tests, self-report, architectural analysis) all fail in different directions. Behavioral tests fail because the systems are trained to pass them. Self-report fails because it is structurally unreliable for any entity with an incentive to provide pleasing answers. Architectural analysis fails because we do not know what architecture would be sufficient for consciousness in the first place.
A third of the people who have used these systems have already decided. The labs have not. The philosophers are arguing. Anthropic asked its own model and published the answer. Nobody has a test. That is the actual story.