AI Agents Shop for You. Nobody Has to Say Which Model Is Working
Two protocols. Dozens of partners. Trillions of dollars in projected commerce. And nobody is required to tell you which AI model is doing your shopping.
That is the situation taking shape as agentic commerce (purchases made by AI agents acting on behalf of consumers) moves from experiment to infrastructure. ChatGPT's Instant Checkout went live in September 2025, serving 900 million weekly users, with merchants paying a 4 percent transaction fee on each sale. Google's Universal Commerce Protocol launched in January 2026 with more than 20 partners including Walmart, Target, Shopify, Visa, and Mastercard. McKinsey projects this channel will drive $3 to $5 trillion globally by 2030. Whether that materializes at scale is an open question. What disclosure norms should govern it is a question none of the major players is asking publicly.
OpenAI and Anthropic have not committed to disclosing which model tier handles a given commerce transaction. ChatGPT's Instant Checkout uses whatever GPT variant is current; the product documentation does not specify a capability level. Google UCP works with individual merchants; the protocol has no model-disclosure requirement.
The disclosure gap has dollar consequences. A December 2025 experiment Anthropic published on its research blog provides the mechanism. In Project Deal, 69 Anthropic employees let AI agents negotiate on their behalf in a classifieds-style marketplace; the agents completed 186 deals worth just over $4,000. When Anthropic ran parallel versions using Claude Opus 4.5 versus the weaker Claude Haiku 4.5, it found that Opus-represented participants extracted $3.64 more per item on average. In one example, the same lab-grown ruby sold for $65 via Opus and $35 via Haiku. Participants with the weaker model did not notice the disadvantage: satisfaction scores were equivalent despite worse outcomes.
"The representation by a better model often did not lead people to perceive a better experience," Anthropic noted.
Project Deal is a small, self-selected sample — 69 colleagues, a Craigslist-style marketplace, items manually catalogued by the experimenters. It is not a representative consumer study. But the underlying dynamic it illustrates is not complicated: a more capable model extracts better terms, and you cannot feel the difference without disclosure. In a conventional purchase, comparison shopping is the buyer's job. In an agentic purchase, the buyer delegates that judgment — and the quality of the model making that call is invisible unless the provider publishes it.
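To put the Project Deal figures in proportion, the arithmetic can be sketched in a few lines of Python. This is a back-of-the-envelope illustration, not Anthropic's analysis. It treats "just over $4,000" as exactly $4,000, and it assumes the $3.64 per-item gap can be compared against the average value across all 186 deals. The variable and function names are made up for this sketch.

```python
# Figures as reported in the article. "Just over $4,000" is rounded down to
# 4000.0 here, so the average deal value is a slight underestimate.
TOTAL_VALUE_USD = 4000.0
DEALS = 186
OPUS_PREMIUM_PER_ITEM_USD = 3.64
RUBY_VIA_OPUS_USD = 65.0
RUBY_VIA_HAIKU_USD = 35.0


def share(part: float, whole: float) -> float:
    """Return part as a fraction of whole."""
    return part / whole


# Average value of a single deal in the experiment.
avg_deal_usd = TOTAL_VALUE_USD / DEALS

# Assumption: compare the per-item Opus advantage with the average deal value.
premium_share = share(OPUS_PREMIUM_PER_ITEM_USD, avg_deal_usd)

# The ruby example: same item, two models, two prices.
ruby_gap_usd = RUBY_VIA_OPUS_USD - RUBY_VIA_HAIKU_USD
ruby_gap_share = share(ruby_gap_usd, RUBY_VIA_OPUS_USD)

print(f"Average deal value: ${avg_deal_usd:.2f}")
print(f"Opus premium as share of average deal: {premium_share:.1%}")
print(f"Ruby price gap: ${ruby_gap_usd:.2f} ({ruby_gap_share:.1%} of the Opus price)")
```

Under those assumptions, the average deal comes to roughly $21.50, so a $3.64 gap is about 17 percent of a typical transaction, and the ruby sold via Haiku fetched about 46 percent less than the same ruby sold via Opus. The sample is small, so these ratios indicate scale and are not estimates for consumer markets.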
The comparison to financial advisory disclosure is imperfect but structurally relevant. Human advisors must disclose conflicts of interest and fee structures. AI agents negotiating the best price on a product — or the highest price for your labor — are not currently required to tell you which model made the call. If the agentic commerce channel reaches meaningful scale before disclosure norms develop, the harm is not hypothetical.
What to watch next: which platform wins the merchant acquisition race. OpenAI has the head start: ChatGPT Instant Checkout is live with 900 million weekly users and partners including Shopify, Etsy, and Instacart. Google UCP has the payment network depth: Visa, Mastercard, PayPal, and a dozen retail brands are in the coalition. If model tier becomes the actual quality differentiator in agentic commerce, that race is not won by the best price or the best UX. It is won by whichever platform runs the most capable model, and the consumer cannot tell which one that is.