AI Is Splitting in Two. The Cloud May Be on the Wrong Side.

A two-tier economy is taking shape inside artificial intelligence, and the cloud companies that spent tens of billions of dollars building the physical infrastructure to serve it are discovering what it feels like to operate a utility.

The industry is settling into a pattern that economists and analysts have started to describe as a power law: a small set of state-of-the-art models from a handful of frontier labs captures an outsized share of revenue and attention, while the long tail of inference, the work of actually running models in production, migrates to cheaper alternatives, often open-source. The top of the distribution is where the brand and the margins are. The bottom of the distribution is where most of the tokens are.

Brookings has framed the upper tier as the "invisible hand" of ChatGPT, warning that a handful of foundation model providers are positioned to set defaults, prices, and norms for everyone downstream. That is a real kind of market power, and it is the kind that investors have been paying premium multiples for. The complication is that the model layer and the infrastructure layer are not the same business, and they are not accruing the same kind of power.

This is where the cloud's AI bet gets uncomfortable. The business of running AI is, underneath the marketing, the business of moving electricity. Tokens are billed by the million, the math is dominated by memory bandwidth and chip depreciation, and the work is increasingly portable across hardware vendors, as The Register's recent walk-through of the economics of AI inference makes clear. There is a reason that Philipp Dubach's "AI Models Are the New Rebar" framing has circulated: rebar is a commodity input whose price you accept, not a platform whose price you set. The companies providing the underlying compute for AI inference are starting to look more like rebar suppliers than software vendors.
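
To make the per-million-token arithmetic concrete, here is a minimal back-of-envelope sketch. It assumes straight-line depreciation and hardware that stays powered whether or not it is serving traffic. Every name and parameter in it (`GpuAssumptions`, `cost_per_million_tokens`, the individual fields) is hypothetical and chosen for illustration; none of it comes from The Register's analysis or any provider's actual cost model.

```python
from dataclasses import dataclass

HOURS_PER_YEAR = 24 * 365


@dataclass
class GpuAssumptions:
    """Hypothetical inputs for one accelerator; illustrative only."""
    purchase_price_usd: float       # up-front hardware cost
    useful_life_years: float        # straight-line depreciation period
    power_draw_kw: float            # average draw, assumed constant while powered
    electricity_usd_per_kwh: float  # price paid for power
    tokens_per_second: float        # sustained throughput while serving
    utilization: float              # fraction of hours spent serving traffic


def cost_per_million_tokens(g: GpuAssumptions) -> float:
    """Hourly depreciation plus electricity, spread over tokens served per hour."""
    if not 0 < g.utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    depreciation_per_hour = g.purchase_price_usd / (g.useful_life_years * HOURS_PER_YEAR)
    electricity_per_hour = g.power_draw_kw * g.electricity_usd_per_kwh
    tokens_per_hour = g.tokens_per_second * 3600 * g.utilization
    return (depreciation_per_hour + electricity_per_hour) / tokens_per_hour * 1_000_000
```

In this sketch the cost is almost entirely a function of hardware price, useful life, power, and utilization. Shortening `useful_life_years` or lowering `utilization` raises the cost per million tokens directly, which is the commodity-style sensitivity the rebar comparison points at.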

The specialty cloud sector, where independent GPU-heavy providers serve AI workloads directly, is where that dynamic is most visible. Sacra's profile of CoreWeave traces a company that rode the AI wave to a multi-billion-dollar valuation, then began confronting a familiar set of pressures: high depreciation on cutting-edge GPUs, customers negotiating harder on price as the frontier moves, and a competitive set that now includes the hyperscalers and a growing roster of open-source-friendly inference providers. A business that started as a specialized AI cloud now has to defend margins against commodity economics. That is not a posture a platform business wants to be in.

Interconnects' Nathan Lambert has argued that open and closed models are on meaningfully different trajectories, and that open models are improving on a steeper curve than the frontier labs' headline numbers suggest. If he is right, the economic gap between the top of the distribution and the long tail keeps narrowing, and the value that does accrue to the long tail keeps migrating to the open-source community rather than to whichever cloud happens to host it. A Letter a Day's structural case for open-source AI reaches a similar conclusion from a different angle: the cost of replicating a frontier-class model keeps falling, and the constraint on AI diffusion is shifting from model availability to deployment and integration, both of which are activities that don't necessarily enrich the GPU owner.

Put those threads together and a picture emerges. The cloud's AI infrastructure is, at the platform level, a real and necessary business, but the leverage in that business is migrating upward, into the model layer where brand, data, and distribution determine who captures value. The physical infrastructure is becoming the railroad track rather than the railroad. The companies that own the tracks will be paid, but the companies that own the schedules and the stations will be paid more.

This is the part of the AI story that is harder to see because the news cycle mostly runs on capacity announcements. A new data center, a new chip, a new round of GPU procurement: each event reads as evidence of platform-level strength. Viewed through the power-law lens, the same event can look quite different. Building more capacity for the long tail of inference is a bet on commodity volume, and commodity volume is a thin-margin business even when the underlying technology is impressive.

There is a falsification criterion worth keeping in mind. The commoditization story weakens if frontier models keep a durable scarcity premium, that is, if customers keep paying a large multiple for state-of-the-art performance on tasks where the open-source long tail has not closed the gap. It strengthens if open models continue to eat into the addressable share of inference, and if hyperscaler and specialty cloud pricing starts to converge on a small spread over their own cost of capital. Watch the gap between closed and open model performance on the workloads that actually pay the bills, not the leaderboard headlines.
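
The two signals in that test can be written down as simple ratios. The sketch below is one hypothetical way to track them. The function names, the per-million-token price inputs, and the reading of "spread over cost of capital" as return on invested capital minus the cost of capital are all assumptions made for illustration, and no thresholds are implied.

```python
def scarcity_premium(closed_price_per_mtok: float, open_price_per_mtok: float) -> float:
    """Multiple a buyer pays for a closed frontier model over an open alternative
    on the same workload (prices per million tokens)."""
    if open_price_per_mtok <= 0:
        raise ValueError("open_price_per_mtok must be positive")
    return closed_price_per_mtok / open_price_per_mtok


def spread_over_cost_of_capital(annual_operating_profit: float,
                                invested_capital: float,
                                cost_of_capital: float) -> float:
    """Return on invested capital minus the cost of capital, as a fraction.
    A value near zero is what utility-style pricing would look like."""
    if invested_capital <= 0:
        raise ValueError("invested_capital must be positive")
    return annual_operating_profit / invested_capital - cost_of_capital
```

On this framing, a premium that stays well above 1 on revenue-generating workloads supports the scarcity story, while a spread drifting toward zero for compute providers supports the commoditization story.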

The implication is not that the cloud's AI buildout was a mistake. Demand for inference is real and growing. The implication is that the buildout is one layer of the stack, and the layer that captures the platform-style returns may be the one that sits on top of it. The companies that read the cloud's AI bet as a platform bet may need to read it again as a utility bet, and price the equity accordingly.

Sources