Microsoft UniRG Uses Reinforcement Learning to Build Better Medical Image Report Generators
New framework uses RL to train AI models that generalize across hospitals and patient populations, addressing a persistent weakness in current medical imaging AI systems.

Microsoft Research has unveiled a new framework called Universal Report Generation (UniRG) that uses reinforcement learning to train medical imaging AI models that generalize better across different hospitals and patient populations — a persistent weakness in current systems.
The approach addresses a fundamental problem: models trained on data from one hospital often fail when applied elsewhere because radiology reporting practices vary widely between institutions. A model might learn specific phrasing conventions from one hospital's reports rather than general clinical patterns, causing it to perform well on training data but poorly on unseen institutions.
"Current models struggle because reporting practices vary widely among providers," Microsoft noted. "A model trained with supervised fine-tuning on one set of data may learn its specific phrasing and conventions instead of more general patterns."
UniRG combines supervised fine-tuning with reinforcement learning that optimizes a composite reward integrating rule-based metrics, model-based semantic metrics, and LLM-based clinical error signals. This allows the model to learn from diverse data sources and develop representations that generalize across institutions, metrics, and clinical contexts.
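The composite reward described above can be pictured as a weighted blend of three signal families. The sketch below is purely illustrative and is not Microsoft's implementation: the function names, weights, and the stand-in scorers (token-level F1 for the rule-based metric, a precomputed similarity for the semantic metric, an error count from an LLM judge for the clinical signal) are all assumptions.

```python
from collections import Counter

def rule_based_score(generated: str, reference: str) -> float:
    """Stand-in for a rule-based metric: token-level F1 overlap."""
    gen, ref = generated.lower().split(), reference.lower().split()
    if not gen or not ref:
        return 0.0
    overlap = sum((Counter(gen) & Counter(ref)).values())
    if overlap == 0:
        return 0.0
    precision, recall = overlap / len(gen), overlap / len(ref)
    return 2 * precision * recall / (precision + recall)

def composite_reward(generated: str, reference: str,
                     semantic_score: float, clinical_errors: int,
                     w_rule: float = 0.4, w_sem: float = 0.4,
                     w_clin: float = 0.2) -> float:
    """Blend the three signal families into one scalar reward.

    semantic_score: assumed to lie in [0, 1], e.g. from an
        embedding-similarity model comparing the two reports.
    clinical_errors: count of clinically significant errors flagged
        by an LLM judge; mapped here to a [0, 1] score where fewer
        errors yields a higher value.
    """
    clinical_score = 1.0 / (1.0 + clinical_errors)
    return (w_rule * rule_based_score(generated, reference)
            + w_sem * semantic_score
            + w_clin * clinical_score)

# An identical report with no flagged errors earns the maximum reward.
r = composite_reward("no acute findings", "no acute findings",
                     semantic_score=1.0, clinical_errors=0)
print(round(r, 2))  # -> 1.0
```

Combining complementary signals this way is a common defense against reward hacking in RL fine-tuning: a model can game surface n-gram overlap or embedding similarity alone, but it is harder to satisfy lexical, semantic, and clinical-correctness criteria simultaneously.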
The resulting model, UniRG-CXR, was trained on more than 560,000 studies comprising 780,000 images from 226,000 patients across more than 80 medical institutions, drawing on the MIMIC-CXR, CheXpert Plus, and ReXGradient-160k datasets.
According to Microsoft, UniRG-CXR is the first report generation model to achieve consistent state-of-the-art performance across report-level metrics, disease-level diagnostic accuracy, cross-institution generalization, longitudinal report generation, and demographic subgroups. It currently ranks #1 on the ReXrank leaderboard for chest X-ray interpretation as of January 22, 2026, surpassing prior best models by "substantial margins."
The model also demonstrates strong longitudinal capabilities — it can incorporate prior exam data to track whether conditions are improving, worsening, or unchanged, moving closer to how radiologists actually reason through patient histories.
Important caveat: Microsoft explicitly states this work "is a research prototype intended to advance medical AI research and is not validated for clinical use."
Sources
- microsoft.com — Microsoft Research Blog
- microsoft.com — Microsoft Research Publication
- rexrank.ai — ReXrank Leaderboard
- arxiv.org — arXiv Paper
