How Kensho built a multi-agent framework with LangGraph to solve trusted financial data retrieval
Querying financial data across domains is a genuinely hard problem. An analyst asking about the relationship between a company's debt load and its ESG rating is asking two different agents to have a conversation — and most AI systems still can't do that reliably. A team at Kensho, S&P Global's data and AI division, published their answer to this on the LangChain blog today: a multi-agent architecture called Grounding, built on LangGraph, that routes natural language queries to specialized Data Retrieval Agents and normalizes the results through a custom protocol designed to standardize data formats across domains.
The architecture is straightforward at a high level: a central router takes in the user's query, breaks it into domain-specific sub-queries, sends each to the appropriate Data Retrieval Agent, and reassembles the response. What makes it interesting is where the complexity lives. Rather than baking natural language parsing into every agent, the router handles that once — directing queries to the right DRAs for equity research, fixed income, macroeconomics, or whatever domain the question touches. Each DRA is owned by the team responsible for that data domain. The aggregation layer is where it gets nontrivial: responses come back in different formats, structured tables alongside unstructured text, and the system has to normalize all of it before presenting it to the user.
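The router-to-DRA flow described above can be illustrated with a minimal sketch. It is not Kensho's code and does not use LangGraph: the keyword map is a stand-in for whatever LLM-based decomposition the real router performs, and every name here (`SubQuery`, `DOMAIN_KEYWORDS`, `decompose`, `make_stub_agent`, `route`) is hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class SubQuery:
    domain: str
    text: str


# Assumption: a keyword map stands in for the router's real query decomposition.
DOMAIN_KEYWORDS: Dict[str, List[str]] = {
    "fixed_income": ["debt", "bond", "yield"],
    "esg": ["esg", "sustainability", "emissions"],
}


def decompose(query: str) -> List[SubQuery]:
    """Split one user query into one sub-query per matching domain."""
    lowered = query.lower()
    return [
        SubQuery(domain, query)
        for domain, words in DOMAIN_KEYWORDS.items()
        if any(word in lowered for word in words)
    ]


def make_stub_agent(domain: str) -> Callable[[SubQuery], dict]:
    """Build a placeholder Data Retrieval Agent that returns no real data."""
    def agent(sub: SubQuery) -> dict:
        return {"domain": domain, "answer": f"<placeholder result for: {sub.text}>"}
    return agent


def route(query: str, agents: Dict[str, Callable[[SubQuery], dict]]) -> List[dict]:
    """Send each sub-query to its domain agent and collect the responses."""
    responses = []
    for sub in decompose(query):
        agent = agents.get(sub.domain)
        if agent is not None:  # skip domains with no registered agent
            responses.append(agent(sub))
    return responses


AGENTS = {domain: make_stub_agent(domain) for domain in DOMAIN_KEYWORDS}
results = route("How does this company's debt load relate to its ESG rating?", AGENTS)
```

The point of the sketch is the division of labor: parsing happens once in `decompose`, and each agent only ever sees a sub-query for its own domain.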
Ilya Yudkovich and Nick Roshdieh at Kensho wrote the post, and the DRA protocol is where I'd say the real architectural work happened. Their key insight wasn't the routing, which is a known pattern. It was standardizing how agents format and exchange results. The protocol establishes a common data format for both structured and unstructured responses, which handles the normalization problem that would otherwise live in every downstream consumer. That's the kind of plumbing that sounds boring until you try to build a multi-agent system without it.
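To make the idea of a common result format concrete, here is a minimal sketch of one possible envelope. The post does not publish the protocol's schema, so the `DRAResult` fields and the `normalize` helper are assumptions for illustration only.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Union


@dataclass
class DRAResult:
    """Hypothetical envelope every agent response is converted into."""
    agent: str
    kind: str  # "structured" or "unstructured"
    rows: List[Dict[str, Any]] = field(default_factory=list)
    text: str = ""
    sources: List[str] = field(default_factory=list)


def normalize(
    agent: str,
    raw: Union[str, List[Dict[str, Any]]],
    sources: List[str],
) -> DRAResult:
    """Wrap a raw agent output (free text or table rows) in the common envelope."""
    if isinstance(raw, str):
        return DRAResult(agent=agent, kind="unstructured", text=raw.strip(), sources=list(sources))
    if isinstance(raw, list) and all(isinstance(row, dict) for row in raw):
        return DRAResult(agent=agent, kind="structured", rows=[dict(row) for row in raw], sources=list(sources))
    raise TypeError(f"unsupported response type from {agent}: {type(raw).__name__}")
```

With an envelope like this, a downstream consumer checks `kind` once instead of special-casing each agent's output.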
The post is notable for what it doesn't claim. There are no benchmark numbers, no latency figures, no head-to-head comparisons. What it does have is three architectural lessons the Kensho team says it learned the hard way. First: observability is not optional in a multi-agent system, and comprehensive tracing and deliberate metadata requirements are essential. Second: the financial industry requires high trust and certainty, which means evaluation can't be a single-step pass/fail. The team evaluates using exact-match metrics (correct agents, expected responses) and tool-calling metrics (correct tools, various responses). Third: they continuously analyze user and agent interaction patterns and optimize the protocol based on what they find. The last one is the most interesting because it's the one most teams skip: it's infrastructure work that doesn't ship a feature. A caveat applies to all of this: the account rests heavily on Kensho's own description of its architecture, and independent attestation of the broader architectural claims is thin, even if the production deployment carries real weight.
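The two metric families can be sketched roughly as follows. The post names the categories but not the formulas, so both functions are assumptions: `exact_match` treats a case as passing only when the selected agents and the response both match expectations, and `tool_call_recall` scores the fraction of expected tools that were actually called.

```python
from typing import Iterable


def exact_match(
    expected_agents: Iterable[str],
    actual_agents: Iterable[str],
    expected_response: str,
    actual_response: str,
) -> bool:
    """True only if the same agents were chosen and the responses match after trimming."""
    same_agents = set(expected_agents) == set(actual_agents)
    same_response = expected_response.strip() == actual_response.strip()
    return same_agents and same_response


def tool_call_recall(expected_tools: Iterable[str], called_tools: Iterable[str]) -> float:
    """Fraction of expected tools that appear among the tools actually called."""
    expected = set(expected_tools)
    if not expected:
        return 1.0  # nothing was required, so nothing was missed
    return len(expected & set(called_tools)) / len(expected)
```

Reporting the two separately is one way to avoid a single pass/fail verdict: a run can call the right tools and still produce a response that differs from the expected one.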
Grounding now backs at least two products Kensho has deployed: an equity research assistant that helps analysts compare sector performance, and an ESG compliance agent that tracks sustainability metrics. Both share the same data access layer, which is the point — the agents are ephemeral, the protocol is the durable part.
The architecture has a commercial home now. The S&P Global Data Retrieval Agent, developed by Kensho, is available on Google Cloud Marketplace and has been validated on Gemini Enterprise. That validation landed in December 2025 when S&P Global announced a multi-year strategic partnership with Google Cloud to unify its proprietary data on BigQuery and expand agentic offerings on Gemini Enterprise. Martina Cheung, president and CEO of S&P Global, framed it as a milestone in the company's data and AI journey. Thomas Kurian, CEO of Google Cloud, called it a demonstration of advanced AI and data distribution at enterprise scale. The Data Retrieval Agent is the concrete product that came out of that — a tool that connects to S&P Capital IQ Financials, Machine Readable Transcripts, and Regulatory Filings, and returns results backed by citations to the underlying sources.
Kensho's agent skills are built on open standards — specifically MCP, the Model Context Protocol — and are designed to work across AI platforms and agent frameworks. That's notable. A proprietary multi-agent protocol would be a dead end; an MCP-backed one can plug into a broader ecosystem. Whether that ecosystem emerges is another question, but the bet is reasonable.
The financial services use case is instructive precisely because the stakes are high and the data is messy. A previous Kensho blog post on querying S&P Global's tabular data noted that off-the-shelf LLMs couldn't answer questions reliably when they required queries spanning more than three normalized tables. That's a concrete failure mode, not a benchmark complaint. Multi-agent routing is one approach to solving it; Kensho's variant pairs that routing with a protocol meant to make it deterministic and auditable.
Whether this pattern scales beyond financial services is an open question. The 451 Research Voice of the Enterprise AI & Machine Learning study found that 58 percent of organizations are actively seeking opportunities to implement agent capabilities — a number that reflects ambition more than deployment, given how early most production agentic systems still are. The router-to-DRA pattern requires infrastructure investment most companies aren't making yet. But for anyone building agent systems that touch multiple data domains, the protocol layer is worth thinking about early. Routing queries is the easy part. Defining what a result looks like when three different agents contributed to it — that's where the actual engineering is.
The Kensho post is worth reading for the three architectural lessons alone. The production deployment on Google Cloud Marketplace gives it weight a toy example doesn't have. And the protocol-first design is the right call, even if it makes the post less exciting to read than a benchmark announcement. Good infrastructure usually reads that way.