Science's Bottleneck Isn't Ideas—It's Verification
Science is harder to verify than math. This is not a metaphor. A mathematical proof either follows from its axioms or it does not, and once formalized it can be checked mechanically by a computer. A scientific claim about how a protein folds, which gene drives a disease, or whether a molecule will bind to a receptor requires running an actual experiment, often in an actual lab with actual reagents. That asymmetry is now the central problem for AI agents trying to automate scientific discovery. Two researchers arriving at the same conclusion from opposite directions, Andrew Beam at Lila Sciences and Marinka Zitnik at Harvard, both land here.
Beam is building what he calls scientific superintelligence: AI systems that generate new knowledge rather than summarize existing knowledge. But he is candid about the bottleneck. A system can propose a drug target, a protein interaction, a gene mechanism — and then what? Verification is expensive, slow, and often impossible without running the experiment. The bottleneck is not generating hypotheses faster. It is checking them.
Zitnik's research quantifies the cost differently. She found that 95 percent of all life sciences publications focus on just 5,000 of the roughly 20,000 known human genes, according to Genetic Engineering and Biotechnology News. That is not a rounding error: the remaining 75 percent of known genes, roughly 15,000, share just 5 percent of the literature, leaving little to cite, few prior experiments to build on, and no community of researchers waiting to validate the next claim. For an AI trained on published science, the sparsely studied genome is a desert. A system that proposes a hypothesis for one of those neglected genes faces a bootstrapping problem: little literature to draw on, and no researcher likely to chase down the suggestion. The verification bottleneck is not just technical. It shapes what science gets done at all.
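The arithmetic behind those figures can be worked through in a few lines. The sketch below is illustrative only: it uses the numbers quoted above (95 percent of publications, 5,000 of roughly 20,000 genes) and adds one simplifying assumption that is not in Zitnik's research, namely that publications are spread evenly within each group of genes. The function and variable names are hypothetical.

```python
# Figures quoted in the article; the total gene count is approximate.
TOTAL_GENES = 20_000
WELL_STUDIED_GENES = 5_000
PUBLICATION_SHARE_WELL_STUDIED = 0.95


def coverage_summary(total_genes, well_studied_genes, publication_share):
    """Summarize how unevenly the literature covers the known genes.

    Assumption (not from the source): publications are spread evenly
    within the well-studied group and within the neglected group.
    """
    neglected_genes = total_genes - well_studied_genes
    neglected_fraction = neglected_genes / total_genes

    # Share of all publications that an average gene in each group receives.
    share_per_well_studied_gene = publication_share / well_studied_genes
    share_per_neglected_gene = (1 - publication_share) / neglected_genes

    return {
        "neglected_genes": neglected_genes,
        "neglected_fraction": neglected_fraction,
        "attention_ratio": share_per_well_studied_gene / share_per_neglected_gene,
    }


summary = coverage_summary(
    TOTAL_GENES, WELL_STUDIED_GENES, PUBLICATION_SHARE_WELL_STUDIED
)
print(f"Neglected genes: {summary['neglected_genes']}")
print(f"Share of known genes that are neglected: {summary['neglected_fraction']:.0%}")
print(f"Attention ratio, well-studied vs. neglected: {summary['attention_ratio']:.1f}x")
```

Under that even-spread assumption, an average well-studied gene draws about 57 times the publication share of an average neglected one. The real distribution is almost certainly more skewed within each group, so the ratio is best read as a rough illustration of scale rather than a measurement.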
The reproducibility crisis makes this worse. Nature reported that more than 70 percent of researchers have tried and failed to reproduce another scientist's experiments, and over half failed to reproduce their own results after a few months. AI systems trained on that literature inherit its errors, its irreproducible results, its publication biases. AI makes the problem more urgent because generation is getting cheaper and faster while verification stays expensive.
LabOS, developed by Le Cong at Stanford University and Mengdi Wang at Princeton University, addresses the bottleneck directly: it automates the experimental loop so an AI hypothesis can be tested in a physical lab without a human researcher in the loop. The argument is that closing the loop is what makes AI useful for discovery rather than literature review.
Kosmos (preprint) was built by Edison Scientific, the commercial spinout of FutureHouse, backed by former Google CEO Eric Schmidt and led by Sam Rodriques, a former group leader at the Francis Crick Institute. The team reported that independent scientists found 79.4 percent of statements in Kosmos reports to be accurate. The preprint (arxiv.org/abs/2511.02824) describes seven discoveries: three independently reproduced findings from preprinted or unpublished manuscripts, and four made novel contributions to the scientific literature. A single 20-cycle Kosmos run performed the equivalent of six months of collaborator research time, reading 1,500 papers and executing 42,000 lines of code.
Latent-Y (preprint), built by Simon Kohl at Latent Labs, takes a different approach — explicitly designed to work without human filtering. The AI proposes candidates and synthesizes them; no human decides which experiments to run. In a preprint posted to arXiv, the team reported that across nine protein targets, Latent-Y produced lab-confirmed nanobody binders against six of them, achieving single-digit nanomolar binding affinities — high-precision molecular binding that matters in drug discovery. The system completed design campaigns 56 times faster than independent expert estimates, compressing weeks of work into hours. Six of nine targets is a 67 percent success rate, and the system never stopped to ask a human for guidance.
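The two headline numbers are simple ratios, and it helps to see what they imply. The sketch below reproduces the success rate from the reported counts and applies the reported 56-fold speedup to a baseline campaign length. The three-week baseline is a hypothetical figure chosen for illustration; it does not come from the preprint, and the function names are assumptions.

```python
def success_rate(confirmed_targets, total_targets):
    """Fraction of targets with at least one lab-confirmed binder."""
    return confirmed_targets / total_targets


def compressed_hours(baseline_hours, speedup):
    """Campaign duration after applying a constant speedup factor."""
    return baseline_hours / speedup


# Counts and speedup as reported in the article.
rate = success_rate(confirmed_targets=6, total_targets=9)
REPORTED_SPEEDUP = 56

# Hypothetical baseline: a three-week campaign counted in calendar hours.
# This figure is an illustration, not a number from the preprint.
baseline_hours = 3 * 7 * 24

hours_after_speedup = compressed_hours(baseline_hours, REPORTED_SPEEDUP)

print(f"Success rate: {rate:.0%}")
print(f"Baseline campaign: {baseline_hours} hours")
print(f"At {REPORTED_SPEEDUP}x: {hours_after_speedup:.1f} hours")
```

With that assumed baseline, 504 calendar hours shrink to nine. Nine targets is also a small sample, so the 67 percent figure carries wide uncertainty.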
Edison Scientific serves more than 50,000 researchers worldwide, according to Genetic Engineering and Biotechnology News. METR, a research organization whose figures are self-reported and which has a track record of optimistic AI forecasting, reported that the length of tasks AI systems can complete has been doubling roughly every seven months. Jensen Huang, chief executive of Nvidia, highlighted OpenClaw at Nvidia's GTC conference in March 2026 as one of the fastest-growing open-source projects in history.
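A seven-month doubling time compounds quickly, which is why the figure gets quoted so often. The sketch below shows the growth factor implied over several horizons, assuming the doubling rate stays constant. That constancy is an assumption made here for illustration, not a forecast attributed to METR, and the function name is hypothetical.

```python
DOUBLING_MONTHS = 7.0  # doubling time quoted in the article


def growth_factor(months, doubling_months=DOUBLING_MONTHS):
    """Multiplier on task length after `months`, assuming steady doubling.

    Assumption: the doubling time stays constant over the whole horizon.
    """
    return 2.0 ** (months / doubling_months)


# Horizons chosen for illustration only.
for months in (7, 14, 28, 84):
    print(f"After {months:>2} months: {growth_factor(months):,.0f}x")
```

The same formula also shows how sensitive the projection is: stretching the doubling time from seven months to ten cuts the seven-year factor from 4,096 to roughly 340.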
The verification bottleneck reframes what the trajectory means. It explains why Latent-Y was designed without human filtering — because inserting a human into the loop is precisely what slows the system down. It explains why LabOS emphasizes closing the loop between hypothesis and execution rather than just generating better reports. And it reframes what "scientific superintelligence" actually means: not an oracle that knows things, but a system that accelerates the full loop from hypothesis to wet-lab confirmation.
For the roughly 15,000 least-studied genes, the situation is more acute. There is little literature to cite, few prior results to check against, and no researcher community primed to validate. An AI proposing a hypothesis for one of those genes faces the hardest version of the verification problem: no one can check the claim without running an experiment, and no one is likely to run the experiment. That is what Beam means when he calls verification the central challenge. Verification is not merely slower than generation. At the frontier of unexplored biology, its cost may be effectively unbounded.
The scientists who use AI may phase out the ones who do not, said Rory Kelleher, senior director of healthcare and life sciences at Nvidia, speaking to Genetic Engineering and Biotechnology News. Whether that is a warning or a prediction depends on what happens to the verification bottleneck. The experiments are not going to run themselves — unless, of course, LabOS has something to say about that.