D&B Rebuilt Its 642M-Company Graph for AI Agents. The Hard Part Was Identity.
Dun & Bradstreet has spent 180 years building a database of businesses. Its Commercial Graph covers 642 million companies, 11,000 fields per record, and runs roughly 100 billion data quality checks per month as information moves through the system — reliably serving nearly 200,000 customers globally, according to VentureBeat. Then AI agents started querying it, and everything broke.
The problem was not the data. It was the architecture underneath it.
As Sean Michael Kerner reported for VentureBeat, D&B's Commercial Graph was a collection of separate systems built for different markets and use cases, held together by custom integrations. Human analysts navigated that fragmentation through SQL queries or pre-built interfaces. They could wait for results. They could work around ambiguous entity matches. AI agents cannot do any of those things.
"We need to think about agents as our new consumer category, evolving from our standard credit analysts or sales and marketing professionals, et cetera, to also now catering to these customers' agents," Gary Kotovets, Chief Data and Analytics Officer at Dun & Bradstreet, told VentureBeat.
The underlying data had also roughly doubled in five years, expanding from over 300 million to more than 642 million business records, as VentureBeat reported. Querying that volume at sub-second latency, against a fragmented architecture, was not workable.
So D&B rebuilt.
What they actually built
The consolidation involved migrating fragmented databases to cloud infrastructure, redesigning the underlying schema, and building a data fabric layer that normalizes records across markets while preserving regional compliance requirements. The result is a unified knowledge graph tracking billions of relationships across 642 million companies, continuously updated by AI-driven processing.
On top of that graph, D&B built a structured access layer for agents. Raw SQL access could not meet agent query volumes and latency requirements. Instead, D&B created a set of tools and skills exposed through Model Context Protocol (MCP), the open standard for connecting AI agents to external tools and data sources regardless of model provider. Those tools package data with context and route agents to the right records for specific queries. A match and entity resolution engine sits behind every query, confirming that when an agent asks about a company, the answer resolves to a verified, specific entity rather than a name match.
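To make the difference between a name match and a resolved entity concrete, here is a minimal sketch. It is not D&B's engine, whose internals the reporting does not describe. The `Entity` type, the `resolve` function, the `entity_id` format and the sample records are all hypothetical. The idea is that a query either resolves to exactly one verified record or comes back flagged as ambiguous, so the agent never guesses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Entity:
    entity_id: str   # hypothetical stable identifier, not a real D&B field
    name: str
    country: str
    verified: bool


# Toy records invented for illustration only.
RECORDS = [
    Entity("E-001", "Acme Holdings Ltd", "GB", True),
    Entity("E-002", "Acme Holdings Inc", "US", True),
    Entity("E-003", "Acme Holdngs", "US", False),
]

LEGAL_SUFFIXES = {"ltd", "inc", "llc", "corp"}


def normalize(name: str) -> str:
    """Lowercase, strip punctuation, and drop common legal suffixes."""
    tokens = [t.strip(".,") for t in name.lower().split()]
    return " ".join(t for t in tokens if t and t not in LEGAL_SUFFIXES)


def resolve(name: str, country: Optional[str] = None) -> dict:
    """Return one verified entity, or report why no single entity was chosen."""
    key = normalize(name)
    matches = [r for r in RECORDS if r.verified and normalize(r.name) == key]
    if country is not None:
        matches = [r for r in matches if r.country == country]
    if len(matches) == 1:
        return {"status": "resolved", "entity_id": matches[0].entity_id}
    if not matches:
        return {"status": "not_found", "candidates": []}
    # More than one verified record fits: hand the choice back to the caller.
    return {"status": "ambiguous", "candidates": [m.entity_id for m in matches]}
```

In this sketch, calling `resolve("Acme Holdings")` reports an ambiguous result with two candidates, while adding `country="US"` narrows it to a single verified record. A real engine would use far richer signals than a normalized name and a country code.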
That solved the data retrieval problem. It did not solve the identity problem.
The Know Your Agent problem
Agents are not humans, and the authentication model built for human users did not extend to machines. D&B built a new registration model: agents must map to a verified IP address and register an individual access key, treated as an authenticated identity in the same pipeline as a human user.
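A registration model along those lines might look like the sketch below. It is an illustration under stated assumptions, not D&B's implementation. The `AgentRegistration` record, the `register` and `authorize` functions, the entitlement names and the in-memory registry are all hypothetical. The IP addresses come from a range reserved for documentation.

```python
import hashlib
import hmac
import ipaddress
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentRegistration:
    agent_id: str
    owner: str               # customer the agent belongs to
    key_sha256: str          # hash of the access key; the raw key is not stored
    allowed_ip: str          # verified address registered for this agent
    entitlements: frozenset  # datasets the agent may query


REGISTRY = {}  # hypothetical in-memory store keyed by agent_id


def _hash(key: str) -> str:
    return hashlib.sha256(key.encode("utf-8")).hexdigest()


def register(agent_id, owner, access_key, ip, entitlements):
    """Record an agent with its own key and a single verified IP address."""
    ipaddress.ip_address(ip)  # raises ValueError if the address is malformed
    REGISTRY[agent_id] = AgentRegistration(
        agent_id, owner, _hash(access_key), ip, frozenset(entitlements)
    )


def authorize(agent_id, access_key, source_ip, dataset) -> bool:
    """Allow a query only if key, source IP and entitlement all check out."""
    reg = REGISTRY.get(agent_id)
    if reg is None:
        return False
    try:
        ip_ok = ipaddress.ip_address(source_ip) == ipaddress.ip_address(reg.allowed_ip)
    except ValueError:
        return False
    # Constant-time comparison avoids leaking key material through timing.
    key_ok = hmac.compare_digest(reg.key_sha256, _hash(access_key))
    return key_ok and ip_ok and dataset in reg.entitlements
```

The point of the design is that an agent is identified by more than a shared credential: the key, the network origin and the entitlement are checked together, much as a human user's session would be.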
"We actually have a concept of Know Your Agent, similar to know your customer, that does those additional verifications," Kotovets said.
That handles the inbound problem: knowing which company an agent belongs to and what data it is entitled to query. But D&B also had to solve the outbound problem — what happens when a customer's own multi-agent workflow loses track of which company it is analyzing.
In a workflow that chains a credit check agent, a KYC agent and a third-party risk agent, each queries D&B at a different step. Without a mechanism to confirm they are all referencing the same entity, a workflow can complete while operating on divergent records.
"They have to come back to our verification agent to ensure that they're still talking to each other about the same entity," Kotovets said. "It's almost like a digital handshake, in a sense."
D&B's business verification agent can be embedded into any workflow as a persistent reference point. It is available over the Agent2Agent (A2A) protocol, which Google introduced, so it can be used regardless of which orchestration tool a customer uses.
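The "digital handshake" can be pictured as a check that runs after every step of a chained workflow. The sketch below is a simplified stand-in: a local `handshake` method plays the role of the verification agent, and the class names, step names and entity IDs are hypothetical. A production version would call out to the verification service instead of comparing strings locally.

```python
class EntityMismatch(Exception):
    """Raised when a step reports a different entity than the workflow pinned."""


class WorkflowContext:
    def __init__(self, entity_id):
        self.entity_id = entity_id  # the entity the workflow started with
        self.steps = []             # (step name, entity id) in execution order

    def handshake(self, step, entity_id):
        """Confirm a step is still referring to the pinned entity."""
        if entity_id != self.entity_id:
            raise EntityMismatch(
                f"{step} used {entity_id}, workflow is pinned to {self.entity_id}"
            )
        self.steps.append((step, entity_id))


def run_workflow(entity_id, agents):
    """Run each agent in order and verify the entity after every step."""
    ctx = WorkflowContext(entity_id)
    for name, agent in agents:
        result = agent(ctx.entity_id)  # each agent returns a dict with entity_id
        ctx.handshake(name, result["entity_id"])
    return ctx.steps
```

The design choice worth noting is that the workflow fails loudly at the first divergent step instead of completing on mismatched records, which is the failure mode described above.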
Four things enterprises keep getting wrong
The rebuild surfaced requirements that extend beyond D&B's own stack. Kotovets said he has spoken with hundreds of CDOs and CIOs over the past six months and consistently heard the same constraint: they could not build what they wanted in AI because their data foundations were not standardized, normalized or agent-queryable. The four lessons D&B drew from that pattern:
Data foundations come before agent infrastructure. Until records are clean, normalized and consolidated, agent infrastructure will surface the same problems at higher velocity.
Design for dynamic relationships, not static ones. Enterprise systems typically record point-in-time connections: a person belongs to a company, an asset belongs to a subsidiary. Agents working on credit, risk or supply chain decisions need to reason across relationships that shift over time. If the underlying data captures only a static snapshot, that snapshot is all the agent can reason about.
Build entity consistency checks into multi-agent workflows. When multiple agents touch the same entity at different steps, there is no guarantee they are all referencing the same record by the time the workflow completes. That gap needs to be engineered for explicitly — it is a workflow design requirement, not an optional guardrail.
Embed lineage from the start. Every agent-produced answer should carry a traceable path back to its source. In credit, risk and supply chain decisions, the cost of an error is concrete. Lineage needs to be built in before scaling.
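Two of these lessons, time-aware relationships and lineage, can be shown in one small sketch. Everything here is hypothetical: the `Relationship` record, the `owner_as_of` function, the entity names and the source identifiers are invented for illustration and do not reflect D&B's schema. Each relationship carries a validity window and a pointer to the record it came from, so an answer can say both when it was true and where it originated.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Relationship:
    subject: str
    relation: str
    obj: str
    valid_from: date
    valid_to: Optional[date]  # None means the relationship is still in effect
    source: str               # lineage: identifier of the originating record


# Invented sample data: ownership of one subsidiary changes hands in 2022.
EDGES = [
    Relationship("sub-1", "owned_by", "parent-A",
                 date(2018, 1, 1), date(2022, 6, 30), "filing-2018-001"),
    Relationship("sub-1", "owned_by", "parent-B",
                 date(2022, 7, 1), None, "filing-2022-114"),
]


def owner_as_of(subject: str, when: date) -> Optional[dict]:
    """Return the owner on a given date together with the source of that fact."""
    for edge in EDGES:
        if edge.subject != subject or edge.relation != "owned_by":
            continue
        started = edge.valid_from <= when
        not_ended = edge.valid_to is None or when <= edge.valid_to
        if started and not_ended:
            return {"owner": edge.obj, "as_of": when.isoformat(), "source": edge.source}
    return None  # no recorded owner on that date
```

A store that kept only the latest edge would answer every historical question with the current owner. Keeping the window and the source on each edge is what lets an agent answer "who owned this in 2020, and according to what?"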
Our read
D&B is not a sexy infrastructure story. It is a 180-year-old company that knows one thing better than almost anyone: who a business is, what it owns, and who runs it. The interesting part of this story is not that a legacy data vendor added an API. It is that D&B had to solve the agent identity problem — both inbound authentication and in-workflow entity verification — before its data could be genuinely useful to machines.
The Know Your Agent framing is the part worth watching. Every enterprise that puts agents into procurement, credit, or compliance workflows will hit the same wall: agents drift. They lose track of which entity they started analyzing, especially in long or branching workflows. D&B's verification agent is one attempt at solving that. Whether it becomes a pattern — or whether every major data provider ends up building something similar — is a question the enterprise AI stack has not answered yet.
The dependency graph matters here. You cannot build reliable agent workflows on unreliable data. You cannot build reliable data on fragmented architecture. And you cannot debug what you cannot trace. D&B's four enterprise lessons are obvious in hindsight, yet the conversations Kotovets describes suggest most organizations are not acting on them yet.