Humans evolved specialized brain networks for language, mathematics, physical reasoning, and social thought. Large language models were trained to do one thing: predict the next word in a text. A new MIT preprint finds that the two systems ended up with surprisingly similar internal organizations, the kind of match that is hard to dismiss as coincidence.
That convergence is the real point. The human brain did not evolve one giant, all-purpose thinking region. It grew distinct circuits that handle different kinds of problems: a language system, a formal-reasoning system, networks for navigating the physical world, and others for reading other minds. The MIT group, based at the McGovern Institute and CSAIL, asked whether the same kind of organization emerges inside AI systems that have never been told anything about brains. Their answer, across 46 researcher-designed probe tasks and six frontier models ranging from 24 billion to 123 billion parameters, is yes.
When the team mapped which neurons inside a 32-billion-parameter Qwen2.5 model responded most strongly to each task, they found that tasks in the same cognitive domain reused the same neurons 4.3 times more often than chance would predict. When they switched those domain-specific neurons off, performance on tasks from that domain collapsed, a 10.3-fold drop relative to the model's general capability. Tasks from other domains, run through the same network with those neurons still switched off, were barely affected.
Those numbers, drawn from the project's public raw data, describe a measurable internal structure: a model trained on nothing but text is carving out something that looks, functionally, like a brain region.
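For readers who want to see what a "reuse versus chance" figure and an ablation ratio could look like in practice, here is a minimal Python sketch. It is not the authors' code. The function names, the top-k selection rule, and the reading of the ablation figure as a ratio of two performance drops are all assumptions made for illustration; the preprint's exact definitions may differ.

```python
import numpy as np

def top_k_neurons(scores, k):
    """Return the indices of the k highest-scoring neurons as a set."""
    return set(np.argsort(scores)[-k:].tolist())

def reuse_enrichment(scores_a, scores_b, k):
    """Observed overlap of two tasks' top-k neurons, divided by chance overlap.

    Assumption: "chance" means two independent random k-subsets drawn from
    the same n neurons, whose expected overlap is k * k / n.
    """
    n = len(scores_a)
    observed = len(top_k_neurons(scores_a, k) & top_k_neurons(scores_b, k))
    expected = k * k / n
    return observed / expected

def ablation_selectivity(in_domain_drop, other_drop):
    """Ratio of the performance drop inside the ablated domain to the drop
    elsewhere (one hypothetical reading of a 'fold drop' figure)."""
    return in_domain_drop / other_drop

# Toy data: 1,000 neurons, two tasks whose top-100 sets share 50 neurons.
scores_a = np.arange(1000, dtype=float)   # top 100 are neurons 900..999
scores_b = np.zeros(1000)
scores_b[950:] = 2.0                      # 50 neurons shared with task A
scores_b[:50] = 1.0                       # 50 neurons not shared
print(reuse_enrichment(scores_a, scores_b, k=100))  # prints 5.0
```

In this toy case the two tasks share 50 of their top 100 neurons where chance would predict 10, so the enrichment is 5. The article's 4.3× figure is the same kind of quantity, computed by the authors on real model activations.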
The result, if it holds up, is not a claim that AI is conscious, sentient, or "thinks like" a human in any colloquial sense. It is a more modest and more interesting claim. Two entirely different optimization pathways, billions of years of evolution on one side and a few years of gradient descent on next-token prediction on the other, appear to have converged on a similar organizational solution. Modularity, the authors suggest, may be a general property of intelligent systems, not an accident of biology.
The authors' method, called attribution probing, is a way of asking which neurons inside a network are most responsible for a given behavior, and at which layer of the network they sit. They constructed 46 scripted probe prompts spanning four domains (language, formal reasoning, physical-world reasoning, and theory of mind) and ran each model through them while tracking which neurons carried the signal. They then compared the resulting depth profiles against canonical brain-network maps from the neuroscience literature. The full code and data are public on GitHub.
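The article does not spell out how the attribution score is computed, so the following Python sketch substitutes a common stand-in, gradient-times-activation, on a tiny hand-built network. Everything here is hypothetical: the toy tanh network, the function names, and the way the depth profile is defined (the share of the top-k attributed neurons that falls in each layer). It is meant only to show the shape of the idea, not the authors' implementation.

```python
import numpy as np

def forward(x, weights, w_out):
    """Run a toy tanh network; return per-layer activations and scalar output."""
    acts, h = [], x
    for W in weights:
        h = np.tanh(W @ h)
        acts.append(h)
    return acts, float(w_out @ h)

def grad_times_activation(x, weights, w_out):
    """Score each hidden neuron by |activation * d(output)/d(activation)|."""
    acts, _ = forward(x, weights, w_out)
    grad = w_out.copy()  # gradient of the output w.r.t. the last hidden layer
    scores = [None] * len(weights)
    for layer in reversed(range(len(weights))):
        scores[layer] = np.abs(acts[layer] * grad)
        # Backpropagate through tanh and the weight matrix to the layer below.
        grad = weights[layer].T @ (grad * (1.0 - acts[layer] ** 2))
    return scores

def depth_profile(scores, k):
    """Fraction of the k highest-attribution neurons found in each layer."""
    flat = np.concatenate(scores)
    layer_ids = np.concatenate(
        [np.full(len(s), i) for i, s in enumerate(scores)]
    )
    top = np.argsort(flat)[-k:]
    return np.bincount(layer_ids[top], minlength=len(scores)) / k

# Toy usage: a 4-layer network with 16 neurons per layer and random weights.
rng = np.random.default_rng(0)
weights = [rng.normal(scale=0.5, size=(16, 16)) for _ in range(4)]
w_out = rng.normal(size=16)
x = rng.normal(size=16)
profile = depth_profile(grad_times_activation(x, weights, w_out), k=10)
print(profile)  # one fraction per layer, summing to 1
```

On a real language model the gradients would come from an automatic-differentiation framework rather than hand-written backpropagation, and the per-layer profile would be averaged over all prompts in a domain before any comparison with brain-network maps.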
Important caveats apply. The work is a 2026 preprint, not yet peer-reviewed. The 46 tasks are researcher-designed probes, not naturalistic conversation, so the result speaks to structured reasoning more than to open-ended chat. The interactive network diagram on the project's demo page is explicitly labeled as a schematic, not a results figure. The 4.3× and 10.3× figures are specific to this benchmark suite and, as described above, to the 32-billion-parameter Qwen2.5 model. The authors themselves hedge the convergence claim as suggestive, not proven.
What to watch next: independent replication on other model families, and tests of whether the same depth profiles show up in models trained on multimodal data, where the optimization pressure is more varied. If they do, the case that modularity is something intelligence keeps rediscovering, rather than a quirk of either carbon or silicon, gets considerably stronger.