Xaira Therapeutics thinks it has found the secret to predicting how cells respond to drugs: go bigger. The San Francisco-based startup, armed with roughly $1 billion in backing and co-founded by Nobel laureates David Baker and Carolyn Bertozzi, on March 17, 2026, released what it calls the first demonstration of a scaling law in virtual cell modeling, according to a BioSpace press release. The model, called X-Cell, has 4.9 billion parameters and was trained on a dataset of 25.6 million perturbed single-cell transcriptomes across seven biologically diverse cellular contexts, the largest such atlas assembled to date, according to the company. The preprint abstract counts the data differently, distinguishing 16 biologically diverse contexts across seven screens. The work is posted as a preprint to bioRxiv and has not yet been peer reviewed.
The result, according to Marc Tessier-Lavigne, Xaira's chief executive and a former president of Stanford who previously served as chief scientific officer at Genentech, is a model that can predict how a cell will react to a perturbation even in biological contexts it has never encountered. In tests, X-Cell demonstrated zero-shot generalization to unseen iPSC-derived melanocyte progenitors and primary human CD4+ T cells from multiple donors — suggesting the model's predictions aren't artifacts of data it already saw. On one key metric, Pearson Δ, X-Cell outperformed existing state-of-the-art models by up to fivefold, the company reported.
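For readers unfamiliar with the metric, here is a minimal sketch of how a Pearson Δ score is commonly computed in the perturbation-prediction literature: correlate the predicted change in expression (perturbed minus control) with the observed change, rather than correlating raw expression levels. The function names and the four-gene toy values below are hypothetical illustrations, not data or code from the X-Cell preprint, and Xaira's exact definition may differ.

```python
import math


def pearson(x, y):
    """Plain Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return cov / math.sqrt(var_x * var_y)


def pearson_delta(control, observed, predicted):
    """Correlate predicted and observed expression changes relative to control."""
    observed_delta = [o - c for o, c in zip(observed, control)]
    predicted_delta = [p - c for p, c in zip(predicted, control)]
    return pearson(observed_delta, predicted_delta)


# Toy mean expression for four genes (made-up numbers, for illustration only).
control = [5.0, 2.0, 8.0, 1.0]
observed = [7.0, 1.0, 8.5, 3.0]          # measured after the perturbation
good_prediction = [6.5, 1.2, 8.4, 2.6]   # tracks the real changes
lazy_prediction = [5.1, 2.0, 7.9, 1.0]   # essentially repeats the control

good_score = pearson_delta(control, observed, good_prediction)
lazy_score = pearson_delta(control, observed, lazy_prediction)
lazy_raw = pearson(observed, lazy_prediction)

print("Pearson delta, good prediction:", round(good_score, 3))
print("Pearson delta, lazy prediction:", round(lazy_score, 3))
print("Raw Pearson, lazy prediction:", round(lazy_raw, 3))
```

In this toy case the lazy prediction still correlates strongly with the observed profile on raw expression, because both look like the control, but its score drops sharply once the control is subtracted out. That is the usual argument for reporting a delta-based metric.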
It is a genuinely impressive technical result. Whether it settles the debate about where virtual cell models should go next is another question entirely.
A competing preprint, also posted to bioRxiv, offers a colder verdict. The 2025 Virtual Cell Challenge — an independent benchmark exercise in which teams submit their models to standardized biological prediction tasks — found that no single model dominated across metrics. The organizers introduced a Generalist Prize specifically to recognize models that performed robustly across the widest range of contexts, an implicit acknowledgment that scale alone doesn't determine whether a model is useful in practice. The challenge, in other words, didn't produce a winner on the basis of size.
Ron Alfa, chief executive of Noetik, a virtual cell startup that competed in the 2025 challenge, put the critique more bluntly. Building up from layers of cell-level experiments to a representation of a tissue, and ultimately of a human, is a challenging feat, he told GEN. He argues that developing models which tokenize patient tissue directly is a more convincing translational approach than scaling up perturbations in immortalized cell lines. Xaira's training data covers seven cellular environments; human biology involves hundreds.
The tension is not abstract. Where the field invests its next billion dollars in compute depends partly on which theory of virtual cell progress turns out to be correct. The scaling law hypothesis — borrowed explicitly from large language models, where adding more parameters and data reliably improves performance — predicts that X-Cell's advantage will compound as the company adds more perturbations, more cell types, and more parameters. The competing view holds that context diversity matters more than sheer scale: a model trained on a wider range of tissue and patient backgrounds will outperform one trained on more perturbations of the same seven cell types, even if the latter has more parameters.
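What "scaling law" means in practice can be sketched with a toy calculation. In the language-model literature, the claim usually takes the form of a power law, error ≈ a · N^(−b), where N is the parameter count (or the dataset size) and a and b are fitted constants. Such a law appears as a straight line on a log-log plot. The sketch below fits that line by ordinary least squares. The functional form, the symbols a and b, the function name, and every number here are assumptions for illustration; the data points are synthetic and are not taken from the X-Cell preprint.

```python
import math


def fit_power_law(sizes, errors):
    """Fit error = a * size**(-b) by least squares on log-transformed values."""
    xs = [math.log(s) for s in sizes]
    ys = [math.log(e) for e in errors]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    # log(error) = log(a) - b * log(size), so a = exp(intercept) and b = -slope.
    return math.exp(intercept), -slope


# Synthetic points generated from a known power law (not measured results).
true_a, true_b = 10.0, 0.25
sizes = [1e7, 1e8, 1e9, 4.9e9]
errors = [true_a * s ** (-true_b) for s in sizes]

a, b = fit_power_law(sizes, errors)

# Extrapolating to a larger, hypothetical model is where the law gets tested.
extrapolated_error = a * (1e10) ** (-b)
print("fitted a:", round(a, 4), "fitted b:", round(b, 4))
print("extrapolated error at 1e10 parameters:", round(extrapolated_error, 5))
```

The fit recovers the constants only because the synthetic points lie exactly on a power law. The open question in the debate above is whether real cell-biology benchmarks keep following such a line as models grow, or whether the curve flattens until more diverse contexts are added.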
Xaira's answer, so far, is to keep scaling. The company is upfront about what it's selling: a platform that can, in theory, predict drug mechanisms of action across therapeutic areas without requiring a new experiment for every cell type. That is a plausible and commercially valuable proposition. Whether the scaling law that holds for language also holds for the messy, context-dependent business of cell biology is a question that won't be answered by a preprint.
What the Virtual Cell Challenge 2025 results make clear is that the field is not yet close to a consensus. X-Cell is the largest model and it posts strong numbers on a specific benchmark. It is also the case that the benchmark with the most diverse set of evaluation conditions didn't produce a single dominant model. Both things can be true. Investors and drug hunters who are looking to the virtual cell space for a clear signal will have to sit with that ambiguity for a while longer.
Xaira says it plans to use X-Cell internally for drug discovery partnerships and to release model weights for academic research, a familiar playbook in AI biology. The preprint is available on bioRxiv. Tessier-Lavigne's background, including his tenure at Stanford and Genentech, is publicly known. His earlier work on APP/DR6 signaling in axon pruning and neuron death, which was retracted from Nature in 2023, has been reported on separately, and the company has not addressed it in X-Cell materials.
What to watch next: whether academic groups reproduce X-Cell's zero-shot results in primary human tissues outside the cell types Xaira tested, and whether the 2026 Virtual Cell Challenge updates its benchmark suite to include more diverse patient and tissue-level contexts. The scale versus context debate isn't resolved. It may not be resolvable until the models start failing or succeeding in actual drug programs.