David Silver just raised $1.1 billion on a bet that the AI research community has already published evidence contradicting.
The former DeepMind researcher, architect of AlphaGo and AlphaStar, announced Monday that his startup, Ineffable Intelligence, has raised the largest seed round in European history: $1.1 billion at a $5.1 billion post-money valuation. The round was co-led by Sequoia and Lightspeed Venture Partners, with participation from Nvidia, Google, DST Global, Index, and the UK's Sovereign AI Fund, according to CNBC and the law firm Cooley, which confirmed it. His approach: pure reinforcement learning, AI that teaches itself through trial and error in simulated environments rather than by ingesting human-generated text. "The mission is to make first contact with superintelligence," he told WIRED. "We are creating a superlearner that discovers all knowledge from its own experience."
A paper presented as an Oral at NeurIPS 2025, "Does Reinforcement Learning Really Incentivize Reasoning Capability in LLMs Beyond the Base Model?", found that current reinforcement learning with verifiable rewards (RLVR) methods do not push AI systems beyond what their base models already know: the base model acts as an upper bound. The researchers tested six popular RLVR algorithms across math, coding, and visual reasoning benchmarks using the pass@k metric, which asks whether at least one of k sampled answers is correct. RL-trained models outperform their base models at small values of k, but the base models overtake them as k grows, which suggests RL concentrates probability on solutions the base model could already reach rather than eliciting new ones. "Our findings suggest that current RLVR methods have not yet realized the potential of RL to elicit truly novel reasoning abilities in LLMs," the authors concluded. Those findings speak directly to the question of whether Silver's approach can deliver what he is promising.
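The crossover behind that finding can be illustrated with the standard unbiased pass@k estimator (the estimator is from Chen et al.'s Codex paper; the per-problem success counts below are hypothetical, chosen only to show the effect, and are not figures from the NeurIPS paper):

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimator: probability that at least one of k
    samples, drawn from n attempts of which c were correct, is correct."""
    if n - c < k:
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical success counts out of n = 100 samples per problem.
# The RL-tuned model is sharper on a problem it can solve but has lost
# all coverage of a harder one; the base model spreads probability thinner.
n = 100
rl_hits = [80, 0]    # problem A, problem B
base_hits = [50, 5]

for k in (1, 64):
    rl = sum(pass_at_k(n, c, k) for c in rl_hits) / len(rl_hits)
    base = sum(pass_at_k(n, c, k) for c in base_hits) / len(base_hits)
    print(f"k={k}: RL={rl:.3f}  base={base:.3f}")
```

At k=1 the RL-tuned model leads, but by k=64 the base model's residual coverage of problem B pushes its average past the RL model's hard ceiling, the same qualitative pattern the paper reports.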
Silver's argument for why LLMs will plateau is illustrative. In an interview with WIRED, he offered a thought experiment: imagine releasing a large language model into a world that believed the Earth was flat. Without the ability to interact with reality, the system would remain a committed flat-earther regardless of how much it was fine-tuned or scaled. That, Silver argues, is the core limitation of learning from human data. You cannot transcend what humans have already figured out.
His proposed alternative draws on his experience building AlphaGo, which learned to play Go from scratch by playing against itself, eventually producing moves no human player would have conceived, including the famous Move 37 in its 2016 match against Lee Sedol. Silver wants to apply that same self-teaching principle to general intelligence, placing AI agents inside simulations where they can discover knowledge through experience rather than imitation.
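The self-play principle can be sketched in miniature with tabular Q-learning on single-pile Nim, where an agent playing only against itself discovers optimal strategy with no human examples (the game, hyperparameters, and update rule here are an illustrative stand-in for the vastly larger AlphaGo pipeline, not a description of Ineffable Intelligence's unpublished method):

```python
import random

ACTIONS = (1, 2, 3)  # stones a player may remove per turn

def train(pile_size=10, episodes=20000, alpha=0.5, eps=0.1, seed=0):
    """Self-play Q-learning on single-pile Nim: players alternate removing
    1-3 stones, and whoever takes the last stone wins. Both sides share
    one Q-table, updated with a zero-sum (negamax-style) target."""
    rng = random.Random(seed)
    Q = {(s, a): 0.0 for s in range(1, pile_size + 1)
         for a in ACTIONS if a <= s}
    for _ in range(episodes):
        s = pile_size
        while s > 0:
            legal = [a for a in ACTIONS if a <= s]
            a = (rng.choice(legal) if rng.random() < eps
                 else max(legal, key=lambda x: Q[(s, x)]))
            s2 = s - a
            if s2 == 0:
                target = 1.0  # taking the last stone wins
            else:
                # Zero-sum: your value is minus the opponent's best reply.
                target = -max(Q[(s2, b)] for b in ACTIONS if b <= s2)
            Q[(s, a)] += alpha * (target - Q[(s, a)])
            s = s2  # the "opponent" moves next, from the same table
    return Q

Q = train()
# Known optimal play leaves the opponent a multiple of 4 stones,
# so from 10 stones the learned policy should remove 2.
best = max(ACTIONS, key=lambda a: Q[(10, a)])
print(best)
```

No position is ever labeled good or bad by a human; the win signal at the terminal state propagates backward through self-play until the table encodes the classic leave-a-multiple-of-four strategy, a toy analogue of how AlphaGo's self-play surfaced moves outside human convention.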
Ravi Mhatre, a partner at Lightspeed, which co-led the round, frames the bet in terms of trajectory rather than evidence. "Silver's career is basically a single, coherent argument for being able to scale intelligence without human priors," he told WIRED. Mhatre said he pressed Silver on safety and believes the simulation-based approach offers a better path to aligned AI, because behavior can be observed rather than inferred from human data.
What is notable, and what the funding figure obscures, is that the research community has not reached consensus on whether the ceiling Silver is betting against actually exists — or whether his specific version of reinforcement learning can circumvent it. The NeurIPS paper addresses RL applied to existing language models. Ineffable Intelligence is not building on top of an LLM; it is starting from scratch with pure RL in simulated environments. Whether that distinction matters is the open question. The company has not published a technical description of its methodology beyond the broad vision.
Silver is donating the equity proceeds from Ineffable Intelligence to charity — a sum that could amount to billions if the company succeeds, according to WIRED. He left DeepMind, where he led the reinforcement learning team from 2013 to 2026, according to his Wikipedia profile, because he wanted to pursue this approach without it being "just a corner of another place dedicated to LLMs," he told WIRED.
The AI industry's response to the funding will be telling. If investors and researchers treat Silver's raise as validation that the LLM scaling path has a ceiling, the repricing of every LLM-centric lab — OpenAI, Anthropic, DeepMind, Meta — begins immediately. If they treat it as a well-funded long shot from a brilliant researcher with a personal theory, the story is different. What the NeurIPS paper makes clear is that the question is no longer theoretical. There is a published answer. Whether it applies is what $1.1 billion is designed to find out.