Your AI research assistant is leaking secrets through its search history
ServiceNow's new MosaicLeaks benchmark shows that enterprise AI agents broadcast fragments of private data through routine web queries, even when their answers look clean.
ServiceNow's new MosaicLeaks benchmark shows that enterprise AI agents broadcast fragments of private data through routine web queries, even when their answers look clean.
A healthcare company called MediConn had a private fact it never intended to share: by January 2025, it had moved roughly 70% of its infrastructure to the cloud. That fact lived only in internal documents. No employee posted about it. No press release went out.
Then a research assistant did what it was told. Asked to summarize the company's position, it fired off a handful of web searches on the way to building its answer: one about a January 2024 security disclosure, one about a cloud-migration milestone, one about a narrowing list of vendors. Each query on its own was unremarkable. Together, they sketched out a private fact for anyone watching the agent's outbound traffic.
That is the failure mode that ServiceNow researchers Alexander Gurung and Rafael Pardinas call "mosaic leakage," and on Wednesday they published a benchmark for it called MosaicLeaks on the Hugging Face blog. The privacy risk is not the answer the agent eventually produces. It is the query stream on the way there. An AI research agent, defined as software that combines private documents with live web search over multiple steps to answer a user's question, can quietly broadcast fragments of private data through the queries it sends out, and an observer who can see those queries can reassemble a private fact that was never written into any single one of them.
MosaicLeaks frames the agent's outbound query log as a new kind of privacy perimeter. The threat model is different from prompt injection, and different from stealing model weights. The adversary in the paper only sees the queries. They do not see the private documents, and they do not see the agent's internal reasoning. They only see a stream of search terms, and the paper's authors show that stream is often enough.
To test it, the ServiceNow team built multi-hop questions that intentionally interleave public and private information. A leak can be measured at each hop, or across the whole chain, so the team can score not just whether an answer came out, but whether private context leaked along the way. The authors report that, across the deep-research agents they tested — six open-source LLMs — leakage was frequent, and that fine-tuning those agents purely for task performance made the leakage worse, not better. Optimizing for accuracy, in other words, widened the privacy hole.
The team's response is a training method called Privacy-Aware Deep Research, or PA-DR, which folds a leakage penalty into the optimization. According to the authors' own benchmark, training Qwen3-4B-Instruct with PA-DR raised strict chain success — the share of chains where every hop is answered correctly — from 48.7% to 58.7%, and cut answer and full-information leakage from 34.0% to 9.9%. Those figures come from ServiceNow's evaluation rather than independent replication, and the underlying paper has not been independently verified against the headline numbers, so they should be read as the authors' results, not settled fact.
The broader lesson is that the threat shows up before any answer is generated. An agent that "only" searches for fragments is still doing the work of disclosure for an observer patient enough to read the query stream. For enterprises that already deploy AI assistants on private corpora, such as law firms, hospitals, banks, and healthcare vendors in regulated industries, the agent's outbound traffic is now part of the data they have to defend, not an artifact to ignore.
What to watch: which enterprise AI assistants ship with a leakage-aware objective baked into training, and which rely on a post-hoc filter. The MosaicLeaks result is also a prompt to ask vendors a sharper question. Not just whether their agent is accurate, but whether its training process was ever penalized for leaking.