Pharma companies are using AI agents to compress drug development timelines. The problem: slow paperwork is not what kills drug programs.
BCG published research showing that generative AI can cut early-stage drug discovery timelines by 25 percent or more and dramatically speed up clinical trial protocol drafting. One multi-agent system BCG built for a global pharma client reduced the time to produce a clinical protocol from six months to a fraction of that, according to its published case study. A healthcare AI market projected to reach $22 billion by 2027 suggests this is not a fringe experiment.
But the math on drug development is unforgiving. Developing and commercializing a new therapy can take up to 15 years and cost billions, according to BCG. The fundamental reason it costs so much: approximately 90 percent of candidates that enter clinical trials fail, with roughly 70 percent of Phase II failures attributable to lack of efficacy, according to DrugPatentWatch. The average cost per approved drug, driven almost entirely by these failures, sits around $2.6 billion.
BCG's numbers are real. What they measure is execution speed. What kills drug programs is a different problem: the hypothesis-testing problem. Did we pick the right target? Does the mechanism actually work in humans? These are questions that faster documentation does not answer.
The distinction matters because the failure mode in drug development is not process inefficiency. If you have a candidate likely to fail Phase II because the biology does not hold up, producing that protocol faster just means you fail faster. The $2.6 billion average is lost to failure, not slowness.
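The arithmetic behind that claim can be sketched with a toy model. It assumes the cost per approved drug is roughly the spend per candidate divided by the probability of approval. The 10 percent success rate comes from the failure figure cited above. The $260 million per-candidate spend is a hypothetical number back-derived from $2.6 billion times 10 percent. The 20 percent "execution share" of cost is purely an illustrative assumption, not a sourced figure.

```python
def cost_per_approval(cost_per_candidate_musd: float, p_success: float) -> float:
    """Expected spend per approved drug, in millions of USD (toy model)."""
    if not 0 < p_success <= 1:
        raise ValueError("p_success must be in (0, 1]")
    return cost_per_candidate_musd / p_success


# Hypothetical inputs, chosen only so the baseline matches the cited average.
BASE_COST_MUSD = 260.0   # assumed spend per candidate ($2.6B x 10%)
BASE_P_SUCCESS = 0.10    # ~90 percent of clinical candidates fail

baseline = cost_per_approval(BASE_COST_MUSD, BASE_P_SUCCESS)

# Scenario A: agents make execution 25% cheaper, and execution is
# assumed (illustratively) to be 20% of per-candidate spend.
EXECUTION_SHARE = 0.20
EXECUTION_SAVING = 0.25
faster_cost = BASE_COST_MUSD * (1 - EXECUTION_SHARE * EXECUTION_SAVING)
scenario_faster = cost_per_approval(faster_cost, BASE_P_SUCCESS)

# Scenario B: better upstream hypotheses lift success from 10% to 15%,
# with no change in per-candidate spend.
scenario_better = cost_per_approval(BASE_COST_MUSD, 0.15)

for label, value in [
    ("baseline", baseline),
    ("faster execution", scenario_faster),
    ("better hypotheses", scenario_better),
]:
    print(f"{label:>18}: ${value:,.0f}M per approved drug")
```

Under these assumed inputs, faster execution takes about 5 percent off the cost per approval, while a five-point gain in success probability takes off roughly a third. The exact numbers depend entirely on the assumptions, but the asymmetry is the point: the denominator dominates.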
This is the gap in the current story about agentic AI in biopharma. The technology is real and the infrastructure being built is genuine. But the ROI narrative drawn from faster timelines measures the wrong thing. BCG's own data shows only about 25 percent of pharma companies report that AI accounted for cost reductions and revenue increases of at least 5 percent. That is not an indictment of the technology. It is a signal that the easy wins are in documentation and execution, not in the upstream decisions that determine whether a program survives.
A September 2025 McKinsey analysis found that 75 to 85 percent of pharma workflows contain tasks that could be enhanced or automated by agents, potentially freeing 25 to 40 percent of employee time. The authors, Dan Tinkoff, Delphine Zurkiya, Eoin Leydon, and Jeffrey Lewis, noted this is the near-term opportunity: automating what already exists rather than solving what has not yet been solved.
A recent analysis in PubMed Central notes that agentic systems are poised to redefine core industry processes but that proof-of-concept and target validation remain largely unaddressed. The author describes a scenario where one AI agent generates a clinical trial protocol while another simulates cost scenarios. Both are genuinely useful. Neither addresses whether the drug works.
The governance layer in multi-agent systems is where the real infrastructure story is. Managing the complexity of modern clinical trials requires systems that can track dependencies, enforce consistency, and maintain audit trails. BCG's system integrates internal content management platforms, regulatory systems, and external scientific databases under governance controls. The senior medical writers BCG worked with signed off on the output, judging that scientific rigor was maintained.
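To make "track dependencies, enforce consistency, and maintain audit trails" concrete, here is a minimal sketch of one way such a layer could work. It is not BCG's system, and every name in it (`AuditLog`, `record`, `verify`, the agent and artifact labels) is hypothetical. It assumes a single append-only log in which each entry is hash-chained to the previous one, and in which an agent cannot record an artifact whose upstream dependencies have not been recorded first.

```python
import hashlib
import json
from dataclasses import dataclass, field


def _digest(payload: dict) -> str:
    """Stable SHA-256 over a JSON-serializable dict."""
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


@dataclass
class AuditLog:
    """Hypothetical append-only, hash-chained log of agent outputs."""
    entries: list = field(default_factory=list)

    def record(self, agent: str, artifact: str,
               depends_on: list[str], content: str) -> dict:
        # Dependency tracking: every upstream artifact must already be logged.
        known = {e["artifact"] for e in self.entries}
        missing = [d for d in depends_on if d not in known]
        if missing:
            raise ValueError(f"{artifact} depends on unrecorded artifacts: {missing}")

        entry = {
            "agent": agent,
            "artifact": artifact,
            "depends_on": list(depends_on),
            "content_sha256": hashlib.sha256(content.encode()).hexdigest(),
            "prev_hash": self.entries[-1]["hash"] if self.entries else "",
        }
        # Audit trail: the entry hash covers all fields above, including
        # the previous entry's hash, so later edits break the chain.
        entry["hash"] = _digest(entry)
        self.entries.append(entry)
        return entry

    def verify(self) -> bool:
        """Consistency check: recompute the chain from the first entry."""
        prev = ""
        for e in self.entries:
            body = {k: v for k, v in e.items() if k != "hash"}
            if e["prev_hash"] != prev or _digest(body) != e["hash"]:
                return False
            prev = e["hash"]
        return True


log = AuditLog()
log.record("protocol_agent", "protocol_v1", [], "draft protocol text")
log.record("cost_agent", "cost_scenarios_v1", ["protocol_v1"], "cost table")
print("chain intact:", log.verify())
```

A production system would need far more (access control, human sign-off steps, versioned storage), but even this toy version shows why the layer is about execution: it can prove who produced what and in which order, and it has nothing to say about whether the underlying hypothesis is right.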
That kind of infrastructure is genuinely useful. It is also the kind of thing that makes the easy wins visible and the hard problems easier to ignore.
The second-order effect is where the misread becomes costly. A pharma executive reading about 25 percent faster timelines might conclude that pipeline productivity is about to improve significantly. A VC evaluating a drug discovery startup might read the same number as evidence that the company has cracked a key bottleneck. Both are wrong about what is actually being accelerated. The technology compresses the execution phase. The hypothesis-testing bottleneck stays where it is.
What multi-agent infrastructure in biopharma actually changes is how much trial complexity can be coordinated at once. More programs can run simultaneously. More variables can be tracked. The governance layer enables scale in a domain where manual coordination has been the constraint. That is real value. It is just not the value the headline numbers promise.
The next thing to watch is whether the focus on documentation efficiency expands upstream: whether agentic systems begin to touch target selection, mechanism validation, or the patient selection logic that determines whether a Phase II failure is a biology problem or a trial design problem. That is where the hypothesis-testing layer lives. Right now, it is untouched.