Xaira Unveils X-Cell, an AI Virtual Cell Model for Predicting Perturbation Responses
Xaira Therapeutics, the South San Francisco-based AI drug discovery startup, unveiled its first computational model Tuesday, and the headline numbers are striking.
X-Cell, the company's debut virtual cell model, was trained on X-Atlas/Pisces: a dataset of 25.6 million perturbed single-cell transcriptomes generated via CRISPRi Perturb-seq across seven biologically diverse cellular contexts, according to a company press release. That's more than three times the scale of Xaira's prior dataset, X-Atlas/Orion. The company calls it the largest genome-wide CRISPRi Perturb-seq dataset ever reported — a claim that, if it holds up to external scrutiny, would place Xaira well ahead of academic and commercial competitors in the virtual cell space.
But scale is only part of the story. The more interesting claim is what X-Cell can do with it.
"The goal of building a virtual cell is to understand biology at a causal level," said Xaira CEO Marc Tessier-Lavigne in the press release. "To be able to ask: if a cell is in a disease state, what does it take to bring it back to a healthy state?"
The company says X-Cell achieves state-of-the-art performance predicting the outcomes of genetic perturbations, including experiments the model has never seen. That generalizability, if real, is the key differentiator. Most current computational biology models interpolate from observed data; Xaira is claiming something stronger: that its model makes predictions beyond what it was shown. X-Cell is built on a novel diffusion large language model architecture with more than 4 billion parameters, and Xaira claims it is the largest virtual cell model to date.
The model was designed by Bo Wang, Xaira's senior vice president and head of biomedical AI, who joined the company in April 2025 from the University of Toronto. Wang is the inventor of scGPT, a foundation model for single-cell multi-omics published in Nature Methods that has become widely cited in the computational biology community. His hire was read in the field as a signal that Xaira was serious about building a competitive AI platform.
"X-Cell provides an important step forward in enabling biologists to use computers to simulate how cells respond to different perturbations without running the actual experiments," Wang said in the company's announcement.
The announcement comes roughly nine months after Xaira released X-Atlas/Orion, its first large-scale Perturb-seq dataset, which is publicly available under a non-commercial license, and published the underlying methodology as a preprint on bioRxiv. X-Cell builds on that foundation. Like its predecessor, X-Atlas/Pisces is interventional rather than observational: it was generated by systematically suppressing genes via CRISPRi and measuring the downstream transcriptomic consequences. That causal structure is what Xaira argues makes X-Cell genuinely predictive rather than merely descriptive.
Xaira has named four specific use cases it expects X-Cell to support: target identification, mechanism of action identification, matching targets to patients, and toxicity prediction. X-Atlas/Orion is already available as a public resource. The company did not specify whether X-Cell itself will be made available externally.
The $1 billion question, literally, is whether X-Cell represents genuine scientific progress or a very expensive press release. The announcement cites no external academic collaborators or independent benchmarks; all performance claims are Xaira's own. The approximately $1 billion in backing dates from the company's founding in April 2024, in a round led by Foresite Capital and ARCH Venture Partners, with participation from David Baker's Institute for Protein Design at the University of Washington and a team that included two Nobel laureates. No updated funding figure has been reported since.
The competitive landscape is not empty. The Chan Zuckerberg Initiative released TranscriptFormer, a generative model for cellular biology, in April 2025. The Arc Institute announced a partnership with 10x Genomics and Ultima Genomics to build the Arc Virtual Cell Atlas. Other companies, including Recursion Pharmaceuticals and Insitro, have built their own perturbation datasets for drug discovery. Whether X-Cell's scale and architecture translate into a meaningful advantage over these efforts is an open question — and one that no third party has yet answered.
Xaira's stated ambition is to move drug discovery "from a trial-and-error process into a predictive engineering discipline." That framing echoes the language of AI image generation circa 2021 — enormous scale, implicit promises, and a lot of runway between the demo and the product. Whether Xaira can close that gap will determine whether this announcement ages like AlphaFold or like another impressive preprint that never became a drug.