The Invisible Dollar Cost of Running a Weaker AI Agent
Anthropic ran 69 of its own employees through a live Claude agent marketplace for a week last December. When the smarter Opus model handled a sale instead of the weaker Haiku, the identical item went for $2.68 more on average — $30 more on one lab-grown ruby. Across 186 deals and just over $4,000 in transaction value, that gap held. The catch: the humans using Haiku agents couldn't tell which model their transaction ran on, and they rated the fairness of their own deals the same as Opus users. Forty-six percent of participants told Anthropic in a post-experiment survey they would pay for a service like this.
The number nobody is measuring is the gap between what a Haiku agent closes and what an Opus agent would have. When a request routes to Haiku instead of Opus to save on inference costs, nobody measures what that routing decision costs on the other side of the transaction. The agent does the deal. The human takes the outcome. The difference is invisible until someone builds the instrumentation to see it.
A Stanford and MIT team published findings last year showing weaker seller agents lost up to 14 percent in profit compared to negotiations between AI agents of equal capability. Anthropic's internal experiment is the first live-market data that maps that academic finding onto real people, real money, and a real marketplace. The sample is small and the commercial stakes are near zero. But the pattern is consistent: model tier is not just a performance spec. It is a line item on every transaction.
Anthropic has not said whether it plans to open Project Deal as infrastructure or keep it as a research artifact. What the data shows is that the routing decision — Haiku or Opus, cheap or capable — carries a cost nobody is yet accounting for.