Two weeks ago, a preprint appeared on bioRxiv with a narrow finding: applying a glycosylation filter, a way of accounting for sugar molecules on the surface of a virus, to a public dataset of AI-designed proteins turned up biological signals that existing computational scores had missed. The dataset was not new. It came from a protein design competition run by Adaptyv Bio targeting the Nipah virus, a pathogen that kills up to 75 percent of the people it infects and has no approved treatment. What was new was the question the preprint authors asked of it.
But the more interesting question the competition was structured to answer remains open. Adaptyv tested three parallel tracks: 600 designs chosen by Boltz-2, an AI model; 400 selected by a panel of protein design experts; 200 from a community vote. All 1,196 designs were screened against the same Nipah target in the same wet lab. The per-pool success rates are sitting on Proteinbase, open for anyone to download. Whether the AI track outperformed the expert panel is the experiment the dataset was built to run, and nobody has run it.
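Running that comparison would be straightforward once someone pulls the per-pool hit counts. A minimal sketch, assuming the question is framed as "do two pools differ in hit rate," is a two-sided two-proportion z-test; the counts below are hypothetical placeholders, not the real Proteinbase numbers.

```python
from math import sqrt, erf

def two_proportion_z(hits_a: int, n_a: int, hits_b: int, n_b: int):
    """Two-sided two-proportion z-test: do pools A and B differ in hit rate?"""
    p_a, p_b = hits_a / n_a, hits_b / n_b
    # Pooled proportion under the null hypothesis of equal hit rates.
    p_pool = (hits_a + hits_b) / (n_a + n_b)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    # Two-sided p-value from the standard normal tail, via the error function.
    p_value = 2 * (1 - 0.5 * (1 + erf(abs(z) / sqrt(2))))
    return z, p_value

# Placeholder counts only -- the actual per-pool results live on Proteinbase.
z, p = two_proportion_z(hits_a=60, n_a=600, hits_b=45, n_b=400)
print(f"z = {z:.2f}, p = {p:.3f}")
```

With pools this large (600 vs 400 designs), even a modest difference in hit rate would be detectable, which is part of what makes the unrun comparison so tantalizing.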
The competition itself ran in early 2026. It generated 111 binders, with 26 hitting single-digit nanomolar affinity, meaning they stuck to the Nipah glycoprotein with enough strength to be worth developing further. One team, Escalante.bio, got 9 out of 10 designs to bind, the highest hit rate in the competition. They did it by targeting the stalk domain of the viral protein, the base that most competitors ignored in favor of the more accessible head. It was a bet on structure over consensus, and it paid off.
The preprint is the most recent reanalysis of this dataset, but it will not be the last. Open competition data that anyone can download is a different kind of scientific infrastructure: it turns a single experiment into a recurring one, where every researcher with a hypothesis can test it against the same 1,196 benchmarked designs. The competition answered which proteins bind. The dataset keeps producing new questions.
What the data cannot yet answer is which approach, AI or human expertise, produces better starting points for drug development. The glycosylation preprint authors looked at the dataset through the lens of sugar chemistry. The AI-versus-experts comparison is still waiting. That is the question the open dataset was built to answer.
Nipah spreads from fruit bats to humans and transmits between people in close contact. It has caused repeated outbreaks in South and Southeast Asia and is on the WHO's list of pandemic threats. Countermeasures are sparse but not absent: ServareGMP has a monoclonal antibody entering trials in an affected country in 2026, and a VSV-vectored vaccine candidate from Public Health Vaccines entered Phase 1 testing in early 2026, with a separate Moderna mRNA vaccine candidate publishing Phase 1 results in Nature Medicine in March 2026. The competition added 111 novel binders to the starting lineup. What happens next is a biology problem, not a computational one.
Sources: bioRxiv preprint (April 16, 2026); Proteinbase; Escalante.bio blog; Adaptyv Bio X; WHO DON593; Nature Medicine; Gavi.