The AI Token Budget Crisis Isn't Real. The Behavior Around It Is.


For most of the last year, the dominant enterprise AI story was how much companies were spending. In 2026, the dominant story is what they are doing to control that spend, and the answer is not what the headlines have been promising.

An AI token is the small chunk of text that a large language model reads and writes. It is also the unit vendors bill on, at roughly a sliver of a cent per token for popular frontier models. For a year, big tech companies including Meta and Salesforce pushed employees to use AI as much as possible, a behavior the analyst community dubbed "tokenmaxxing." That phase produced viral stories about runaway costs and an internal Meta dashboard called "Claudeconomics" that ranked the company's top 250 employees by token consumption.
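To make the per-token billing concrete, here is a minimal arithmetic sketch. The price of $10 per million tokens, the 22 working days, and the function names are all illustrative assumptions, not figures from SemiAnalysis or any vendor's price list.

```python
# Hypothetical price in USD per one million tokens (an assumption for
# illustration only; real vendor prices differ by model and direction).
PRICE_PER_MILLION_TOKENS_USD = 10.0


def monthly_cost_usd(tokens_per_day: int,
                     working_days: int = 22,
                     price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> float:
    """Estimate one user's monthly bill from their daily token volume."""
    total_tokens = tokens_per_day * working_days
    return total_tokens / 1_000_000 * price_per_million


def tokens_affordable(budget_usd: float,
                      price_per_million: float = PRICE_PER_MILLION_TOKENS_USD) -> int:
    """Return how many tokens a monthly budget buys at the given price."""
    return int(budget_usd / price_per_million * 1_000_000)


if __name__ == "__main__":
    # A user consuming 500,000 tokens per working day.
    print(monthly_cost_usd(500_000))
    # Tokens covered by a $250 monthly budget at the assumed price.
    print(tokens_affordable(250.0))
```

At these assumed numbers, a sliver of a cent per token still adds up: heavy daily use by one person can consume a meaningful share of a small monthly budget.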

Now the money is being managed. New reporting from SemiAnalysis, based on more than 50 conversations with enterprise AI buyers at the Databricks AI Summit and over Slack and phone, suggests that this management is producing something stranger than a clean pullback.

Budgets, when SemiAnalysis asked, ranged from roughly $250 a month at the low end to tens of thousands of dollars at the high end, with no consensus level among the buyers the firm spoke with. The familiar "we are blowing our AI budget" framing was not the median experience. Most of the 50+ conversations described costs that were real but controlled, and the few large overruns tended to cluster at specific companies with specific problems, not across the enterprise base.

Where the uniform story does exist is in behavior. To stretch budgets, companies are downgrading the default model that employees see in their AI tools, turning off premium tiers, and routing specific workflows to cheaper models after evaluation. The lever is the model picker, not the user count.
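The model-picker lever described above can be sketched as a small routing rule. This is a hypothetical illustration, not any company's actual system: the tier names, prices, evaluation scores, and quality threshold are all assumed values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    price_per_million_usd: float  # hypothetical price, not a vendor quote


# Hypothetical tiers; names and prices are placeholders.
PREMIUM = ModelTier("premium-model", 15.0)
BUDGET = ModelTier("budget-model", 1.0)

# The downgraded default that employees see in their tools.
DEFAULT_MODEL = BUDGET

# Assumed evaluation results: the budget model's quality score per workflow.
BUDGET_EVAL_SCORES = {"email-summary": 0.93, "code-review": 0.71}
QUALITY_THRESHOLD = 0.90  # assumed cutoff for "good enough"


def pick_model(workflow: str, premium_enabled: bool = True) -> ModelTier:
    """Route a workflow to the cheapest tier that passed evaluation."""
    score = BUDGET_EVAL_SCORES.get(workflow)
    if score is None:
        # Never evaluated: fall back to the (downgraded) default.
        return DEFAULT_MODEL
    if score >= QUALITY_THRESHOLD:
        # The cheaper model was good enough for this workflow.
        return BUDGET
    # The cheaper model fell short; use premium only if that tier is still on.
    return PREMIUM if premium_enabled else DEFAULT_MODEL


if __name__ == "__main__":
    for wf in ("email-summary", "code-review", "unknown-task"):
        print(wf, "->", pick_model(wf).name)
```

Note that nothing in this rule touches the number of users. Cost is controlled entirely by which model each request lands on, which is the point the reporting makes.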

A second lever is cultural. The "tokenmaxxing" phase did not die so much as migrate. Meta's internal "Claudeconomics" leaderboard, which ranked employees by token consumption, has been widely discussed. The most uncomfortable of the new behaviors is the gaming of M365 Copilot subscriptions, where employees manipulate usage metrics to preserve their token allowance inside a constrained per-seat budget. SemiAnalysis presents these as analyst-attributed observations, not as a corporate program, but the pattern recurred in enough customer conversations that the firm treats it as the dominant cultural fact of the 2026 rationalization phase.

The widely reported retrenchments at Meta and Uber, framed in the press as evidence of an industry-wide pullback, look in this light like specific organizational problems. Both companies ran incentive structures that rewarded maximal AI consumption, and when those structures were removed, consumption dropped. The cause was not the budget; it was the bonus. At most of the other organizations SemiAnalysis spoke with, no such incentive existed, and there is no comparable retrenchment to report.

The pricing side reinforces the picture. In the same week that SemiAnalysis published its findings, Anthropic released Claude Sonnet 5 at a lower per-token price point, a move that compresses the cost of running AI on a premium model and gives budget-constrained teams a way to extend their allowance without downgrading. The era of one big model at one price is ending. The structure of 2026 is many models at different prices, with humans in the middle arguing about which one to use.

What to watch next is whether the second-order behavior stabilizes into a normal cost-control pattern or becomes the dominant feature of how AI gets used inside large companies. A leaderboard culture, a default-model downgrade, and a subscription-allocation game are not what a CFO thinks they are paying for when they approve an AI line item. The gap between what AI budgets buy and what AI budgets produce is the question the next year of enterprise AI will resolve, one way or another.

