Microbes Now Have Fingerprints. Epidemiology Hasn't Been the Same Since.
In 1984, British geneticist Alec Jeffreys noticed something remarkable: certain regions of human DNA varied enough between individuals that a single blood sample could identify a person as uniquely as a fingerprint. The first forensic use came three years later, in a pair of rape-murder cases in Leicestershire, and within a decade DNA profiling had turned forensic science from circumstantial matching into definitive biological attribution. A similar transformation is now reaching epidemiology, and the tool doing it has just arrived in open-source form.
TRACS, published April 24 in Nature Microbiology by Tonkin-Hill et al., stands for TRAnsmission Clustering of Strains. Its core capability is to take a metagenomic sample (a jumble of genetic material from many organisms in a tissue or environment) and extract, from that noise, the single-nucleotide differences that distinguish two strains of the same microbial species. From those differences it estimates how recently two people likely shared a pathogen. It works on viruses, bacteria, and parasites. The authors validated it against SARS-CoV-2 from UK hospital patients, deep population sequencing of Streptococcus pneumoniae, and single-cell genome data from Plasmodium falciparum malaria patients.
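To make the step from nucleotide differences to timing concrete, here is a minimal Python sketch. It is not the TRACS implementation: the function names, the pre-aligned input strings, and the strict molecular clock are all assumptions made for illustration. It counts positions where two aligned sequences confidently differ, then converts that count into a rough divergence time.

```python
def snp_distance(seq_a: str, seq_b: str) -> int:
    """Count aligned positions where both sequences carry a different A/C/G/T base.

    Hypothetical helper: assumes the two sequences are already aligned to the
    same reference. Ambiguous or missing calls (N, -) are skipped, not counted.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    valid = set("ACGT")
    return sum(
        1
        for a, b in zip(seq_a.upper(), seq_b.upper())
        if a in valid and b in valid and a != b
    )


def divergence_years(snps: int, genome_length: int, subs_per_site_per_year: float) -> float:
    """Rough time since two lineages shared an ancestor, under a strict clock.

    Assumption: mutations accumulate independently along both lineages at a
    constant rate, hence the factor of 2 in the denominator.
    """
    if genome_length <= 0 or subs_per_site_per_year <= 0:
        raise ValueError("genome_length and rate must be positive")
    return snps / (2 * subs_per_site_per_year * genome_length)


if __name__ == "__main__":
    # Illustrative numbers only: a genome of about 30,000 bases and an
    # assumed rate of 1e-3 substitutions per site per year.
    years = divergence_years(snps=4, genome_length=29_903, subs_per_site_per_year=1e-3)
    print(f"{years * 365:.0f} days")
```

With those illustrative inputs, four differences correspond to roughly 24 days of divergence. Real samples complicate this considerably, since mixed strains and uneven coverage make the difference count itself uncertain.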
That cross-kingdom range matters. Existing tools for strain-level transmission tracking were either built for one pathogen type or required the sample to contain only one species at high enough coverage to assemble cleanly. TRACS handles mixed populations, which is what actual clinical and environmental samples look like. It also adds new samples incrementally without reprocessing the entire dataset, making it suitable for ongoing surveillance and not only retrospective analysis.
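The idea of adding samples without reprocessing everything can be sketched with a union-find structure. This is an illustration of incremental threshold clustering in general, not a description of how TRACS does it: the class name, the Hamming-distance helper, and the SNP threshold are all hypothetical.

```python
def hamming(a: str, b: str) -> int:
    """Number of differing positions between two equal-length profiles."""
    if len(a) != len(b):
        raise ValueError("profiles must have the same length")
    return sum(x != y for x, y in zip(a, b))


class IncrementalClusters:
    """Single-linkage clusters that grow one sample at a time (hypothetical sketch)."""

    def __init__(self, threshold: int):
        self.threshold = threshold  # max SNP distance treated as a plausible link
        self.parent = {}            # union-find parent pointers, keyed by sample name
        self.profiles = {}          # stored profiles for samples already added

    def _find(self, name: str) -> str:
        # Walk to the cluster root, shortening the path as we go.
        while self.parent[name] != name:
            self.parent[name] = self.parent[self.parent[name]]
            name = self.parent[name]
        return name

    def add(self, name: str, profile: str) -> str:
        """Add one sample, comparing it only against samples already stored."""
        if name in self.parent:
            raise ValueError(f"sample {name!r} already added")
        self.parent[name] = name
        for other, other_profile in self.profiles.items():
            if hamming(profile, other_profile) <= self.threshold:
                # Link the new sample's cluster to the existing one.
                self.parent[self._find(name)] = self._find(other)
        self.profiles[name] = profile
        return self._find(name)

    def clusters(self) -> list:
        groups = {}
        for name in self.parent:
            groups.setdefault(self._find(name), set()).add(name)
        return list(groups.values())


if __name__ == "__main__":
    tracker = IncrementalClusters(threshold=1)
    for sample, profile in [("a", "AAAA"), ("b", "AAAT"), ("c", "GGGG")]:
        tracker.add(sample, profile)
    tracker.add("d", "AATT")  # arrives later; earlier pairs are not recomputed
    print(tracker.clusters())
```

Each new sample costs one comparison per stored sample, and earlier pairwise distances are never recomputed. A new sample that sits close to two existing clusters also merges them, which is the behaviour an outbreak investigator would want when a missing link turns up.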
The research team, based at the Peter MacCallum Cancer Centre, the Wellcome Sanger Institute, and the University of Oslo, built TRACS partly with immunocompromised cancer patients in mind. Those patients can carry multiple strains of the same species simultaneously, and existing tools could not reliably sort which strain came from which source. TRACS can. That is not a theoretical advance. It is the difference between knowing that a vulnerable patient acquired a pathogen and knowing which of their contacts transmitted it.
One finding from the paper stands out as genuinely surprising: applying TRACS to mother-infant gut metagenome pairs, the team found that Bifidobacterium breve persists in infants far longer than had previously been detected. Earlier methods missed it because the presence of multiple co-existing strains confused the analysis. The infant was carrying a strain from its mother that nobody knew was there. Whether that persistence matters for infant development is an open question. But it is a concrete example of how poor our resolution has been.
The DNA fingerprinting parallel is worth pressing. In the mid-1980s, courts initially resisted genetic evidence. The science was novel, the error bars were not well characterized, and lawyers argued about population allele frequencies. Strain-level genomic surveillance will face similar resistance in clinical and public health settings. The tools exist. The validation standards, the legal frameworks, and the consent models for tracking which specific microbial lineage moved between two identified individuals do not. That gap is not a reason to dismiss TRACS. It is the story that comes after TRACS.
The code is on GitHub under an MIT license, installable via conda or pip. The dependencies are standard bioinformatics tools — samtools, minimap2, htsbox — available in any sequencing lab. This is not a technology held behind a paywall or a proprietary platform. Which means the question is not whether this capability will spread. It is whether public health institutions, hospital infection control teams, and outbreak response networks can absorb it faster than the legal and ethical frameworks around it can develop.
Epidemiology spent a century learning to track pathogens at the species level. It is now learning to track them at the strain level. The resolution jump is the same one forensics made with DNA fingerprinting, and the consequences — for outbreak investigation, for hospital infection control, for microbiome medicine — will be similarly large.