Why nobody at the company can explain the AI bill anymore

Picture a finance lead at a mid-sized software company opening last month's AI invoice. Six figures, broken into categories she did not sign a contract for, with line items labeled in vendor-specific units she cannot map to anything her team tracked. The vendor's account manager can explain the math. Her internal forecast cannot.

For decades, enterprise software bills were predictable: a seat, a subscription, a line item a CFO could sign off on in under a minute. That model is gone. Across the AI stack, from model APIs and GPU rental to hosted inference platforms and the new AI features bolted onto existing enterprise tools, vendors have shifted to usage-based pricing tied to tokens or compute-hours. The result, according to a survey of executives conducted by KPMG and reported by The Register, is that nearly one in three executives admit they cannot fully explain their own company's AI operating costs.

The KPMG finding lands at a moment when the pricing structure has changed faster than the finance function's ability to track it. Anthropic, OpenAI, and GitHub have all moved away from flat seat-based subscriptions toward bills that scale with how much a customer actually consumes. That sounds fair on its face: pay for what you use. But a per-token meter, or a per-GPU-hour meter, behaves nothing like a per-seat license. A single enterprise workflow can fan out into dozens of model calls, each consuming tokens at a rate that depends on prompt size, model choice, response length, and routing logic. Multiply that by a few hundred employees and a handful of automated agents, and the monthly bill becomes a function of inputs no procurement team was ever set up to forecast.
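The fan-out arithmetic in that paragraph can be made concrete with a rough sketch. Everything below is an illustrative assumption rather than any vendor's actual pricing: the `ModelCall` type, the per-million-token rates, and the usage figures are hypothetical and exist only to show how per-call costs multiply into a monthly number.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelCall:
    """One model call inside a workflow. All rates are hypothetical."""
    input_tokens: int
    output_tokens: int
    input_rate_per_million: float   # assumed USD per 1M input tokens
    output_rate_per_million: float  # assumed USD per 1M output tokens

    def cost(self) -> float:
        # Input and output tokens are metered at different rates.
        return (
            self.input_tokens * self.input_rate_per_million
            + self.output_tokens * self.output_rate_per_million
        ) / 1_000_000


def workflow_cost(calls: list[ModelCall]) -> float:
    """Cost of a single workflow run: the sum over every call it fans out into."""
    return sum(call.cost() for call in calls)


def monthly_cost(
    calls: list[ModelCall],
    runs_per_user_per_day: int,
    users: int,
    working_days: int = 22,
) -> float:
    """Scale one run up to a month of usage across a user base."""
    return workflow_cost(calls) * runs_per_user_per_day * users * working_days


# Hypothetical workflow: 12 calls, each with a 4,000-token prompt
# and a 500-token response, at assumed rates of $3 and $15 per 1M tokens.
calls = [ModelCall(4_000, 500, 3.0, 15.0) for _ in range(12)]
estimate = monthly_cost(calls, runs_per_user_per_day=20, users=300)
print(f"Estimated monthly cost: ${estimate:,.2f}")
```

Under these made-up inputs, a run that costs well under a dollar becomes a five-figure monthly line item, and every one of the inputs (prompt size, response length, call count, runs per day) can drift without anyone in procurement being told.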

The difficulty compounds when companies stack multiple AI vendors, as most now do. CFO Dive, summarizing KPMG's analysis, points out that the cost picture gets harder as firms lean on coordinated agent workflows, where one user request can trigger a chain of model calls across systems. Token consumption in this regime is non-linear, which is finance-speak for "your forecast is wrong and we will tell you by how much at the end of the month."
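One way to see where the non-linearity comes from is a toy model of an agent chain. The sketch below assumes, purely for illustration, that every step re-sends the full accumulated context and adds a fixed number of tokens to it. Real systems may cache, truncate, or summarize context, so the function and its parameters are hypothetical, not a description of any particular platform.

```python
def chain_input_tokens(
    steps: int,
    base_prompt_tokens: int,
    tokens_added_per_step: int,
) -> int:
    """Total input tokens billed across a chain of model calls.

    Assumes each step re-sends the whole context so far, and that each
    step appends a fixed number of tokens (tool output, prior response).
    """
    total = 0
    context = base_prompt_tokens
    for _ in range(steps):
        total += context                  # this step is billed for everything sent
        context += tokens_added_per_step  # the next step carries more history
    return total


five_steps = chain_input_tokens(5, base_prompt_tokens=1_000, tokens_added_per_step=500)
ten_steps = chain_input_tokens(10, base_prompt_tokens=1_000, tokens_added_per_step=500)
print(five_steps, ten_steps, ten_steps / five_steps)
```

With these assumed numbers, five steps consume 10,000 input tokens and ten steps consume 32,500. Doubling the length of the chain more than triples the input bill, which is why a forecast built on "cost per request" tends to break once agents start chaining calls.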

The vendor response to this confusion has, so far, been modest. Pricing pages are public, and discount tiers exist, but the units themselves remain opaque to anyone outside the vendor's product team. Anthropic's pricing page lists token rates by model, but a finance lead comparing Claude to GPT to Gemini first has to map each vendor's notion of a "token" to a common denominator that does not really exist, because each vendor's tokenizer splits the same text into a different number of tokens. GPU rental from the hyperscalers adds another layer, since compute-hours and inference-hours are billed separately and bundled differently depending on region and reservation. The result is what the industry has started calling token-maxxing: users gaming the meter because the meter is the only feedback they trust.

That distrust is also visible inside the consulting industry that finance teams typically rely on. KPMG's October 2025 AI report was retracted after GPTZero flagged widespread citation errors, a reminder that even Big Four cost analyses on AI have been built on shaky inputs. The June 2026 KPMG Global AI Pulse survey that surfaces the "one in three" finding names cost visibility and accountability as a top blocker to AI value realization in enterprises, ahead of model quality and ahead of data readiness. The diagnosis is no longer that AI is too expensive. The diagnosis is that companies cannot see what they are paying for.

Finance and engineering teams are not waiting for vendors to fix this. The response architecture forming inside companies treats AI spend the way cloud spend was treated a decade ago. FinOps-style tagging is being bolted onto every model API call so that costs can be allocated back to the team or product feature that triggered them. Reserved-capacity and commitment discounts are being negotiated up front, the way AWS and Azure customers learned to do, in exchange for predictable rates. Internal rate cards are being drafted by finance and platform teams so that engineering has a shadow price to design against, even when the vendor's meter is opaque. Procurement teams are rewriting vendor evaluation scorecards to require consumption benchmarks and forecasting primitives before contracts are signed.
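The tagging and rate-card ideas can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `RATE_CARD` values are invented internal shadow prices, not vendor prices, and the model names, `UsageRecord` fields, and team and feature tags are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass

# Hypothetical internal rate card, in USD per 1M tokens. These are shadow
# prices a finance team might set, not any vendor's published rates.
RATE_CARD = {
    "small-model": {"input": 0.50, "output": 1.50},
    "large-model": {"input": 3.00, "output": 15.00},
}


@dataclass(frozen=True)
class UsageRecord:
    """One tagged model call, as a logging layer might record it."""
    team: str
    feature: str
    model: str
    input_tokens: int
    output_tokens: int


def shadow_cost(record: UsageRecord) -> float:
    """Price a single call using the internal rate card."""
    rates = RATE_CARD[record.model]
    return (
        record.input_tokens * rates["input"]
        + record.output_tokens * rates["output"]
    ) / 1_000_000


def allocate(records: list[UsageRecord]) -> dict[tuple[str, str], float]:
    """Roll costs up to the (team, feature) pair that triggered them."""
    totals: dict[tuple[str, str], float] = defaultdict(float)
    for record in records:
        totals[(record.team, record.feature)] += shadow_cost(record)
    return dict(totals)


records = [
    UsageRecord("support", "summarize", "large-model", 1_000_000, 200_000),
    UsageRecord("support", "summarize", "small-model", 2_000_000, 0),
    UsageRecord("search", "rerank", "small-model", 1_000_000, 1_000_000),
]
for (team, feature), cost in sorted(allocate(records).items()):
    print(f"{team}/{feature}: ${cost:.2f}")
```

The design choice worth noting is that the tags travel with each call record, so allocation is a simple group-by. Pricing against an internal card rather than the invoice means engineering gets a stable number to design against even when the vendor's own units change.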

None of this makes the AI bill predictable tomorrow. What it does is make it legible, and finance leads who can read their own invoice can hold their vendors to account for the meter they are running on. The next phase of this story is whether the vendors, in turn, choose to make that meter easier to read, or whether the FinOps layer becomes the permanent interface between corporate AI spend and the underlying compute economy.
