The FDA has spent a decade clearing radiology AI that does one job: point at a suspect region on a scan and let a radiologist decide. In March and June 2026, the agency took a different step. It gave breakthrough device status to two systems that no longer just point. They write.
That shift is bigger than the announcements suggest. Cognita, marketed by Mosaic Clinical Technologies, and Aidoc's First Read are vision-language models designed to produce full-text radiology findings for a human reviewer to sign off on. Mosaic announced Cognita's designation on March 4, 2026. Aidoc's arrived on June 25, 2026, as reported by STAT. Both now face a regulatory bar that was built for detection tools, and that is where the structural problem shows up.
The standard FDA pathway for radiology AI is the 510(k) clearance, in which a vendor shows its algorithm matches the performance of a known device on a labeled dataset. A bounding box that says "possible pulmonary embolism" can be checked against thousands of CT scans where the answer is already known. A vision-language system that outputs a sentence like "1.2-centimeter spiculated nodule in the right upper lobe, recommend follow-up" has no equivalent fixed ground truth. Two radiologists can read the same image and produce different reports; asking a generative model to match either of them is closer to grading an essay than scoring a multiple-choice test.
Breakthrough designation is not approval. It is a faster review lane for tools that target an unmet need, and most products that earn it still face a full clearance decision later. The risk for vendors, and the more interesting story, is what happens then. The 510(k) framework will be asked to evaluate free-text outputs, and there is no settled methodology for doing that at scale. Mosaic holds the breakthrough designation as the regulatory sponsor for Cognita, a system built by Stanford researchers and acquired by Radiology Partners late last year, according to STAT. Aidoc separately holds an earlier FDA clearance for what it calls a "comprehensive foundation model" for radiology, giving it a dual position that spans detection and generative products.
The question that follows both announcements is what counts as ground truth for a system whose output is prose. Academic radiology has long studied inter-reader variability, and pivotal studies for these systems will need to define their own reference standards: panel agreement, consensus reads, or expert adjudication. Each choice embeds a different definition of accuracy. Aidoc's First Read is described in the company announcement as targeting four life-threatening findings on CT in acute care settings, which means a pivotal study will need to define what "ground truth" means for each of those four categories. If that benchmark is a panel of radiologists rather than a fixed dataset, every clearance will inherit whatever disagreements the panel has, and every model update will become a claim about how those disagreements were resolved.
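To make the point about reference standards concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the six cases, the three-reader panel, the model's calls, and the function names are invented for illustration and do not come from either company or from any FDA protocol. It scores the same model output against two reference standards built from the same panel, a simple majority vote and a unanimous consensus.

```python
from typing import Dict, List

# Hypothetical panel reads (not real data): for each of six made-up cases,
# three radiologists report a finding as present (1) or absent (0).
PANEL_READS: List[List[int]] = [
    [1, 1, 1],
    [1, 1, 0],
    [1, 0, 0],
    [0, 0, 0],
    [0, 1, 1],
    [0, 0, 1],
]

# Hypothetical model calls for the same six cases.
MODEL_CALLS: List[int] = [1, 1, 0, 0, 1, 1]


def majority_reference(reads: List[int]) -> int:
    """Finding counts as present if more than half the readers report it."""
    return 1 if sum(reads) * 2 > len(reads) else 0


def unanimous_reference(reads: List[int]) -> int:
    """Finding counts as present only if every reader reports it."""
    return 1 if all(reads) else 0


def accuracy(calls: List[int], reference: List[int]) -> float:
    """Fraction of cases where the model's call matches the reference."""
    matches = sum(c == r for c, r in zip(calls, reference))
    return matches / len(calls)


def score_against_standards(
    panel: List[List[int]], calls: List[int]
) -> Dict[str, float]:
    """Score one set of model calls against both reference standards."""
    majority = [majority_reference(reads) for reads in panel]
    unanimous = [unanimous_reference(reads) for reads in panel]
    return {
        "majority": accuracy(calls, majority),
        "unanimous": accuracy(calls, unanimous),
    }


if __name__ == "__main__":
    for name, value in score_against_standards(PANEL_READS, MODEL_CALLS).items():
        print(f"{name}: {value:.2f}")
```

On this toy data the model agrees with the majority-vote standard on five of six cases and with the unanimous standard on only three of six. The model did not change; the definition of the right answer did.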
That is the durable read. The two announcements are the news. The methodology question is what the news actually changes, and it will follow these products into the FDA's review queue for years.