The hottest prediction in enterprise software a year ago was that AI agents would make SaaS (Software as a Service, the subscription model most enterprise tools run on) obsolete. Box CEO Aaron Levie says that reading is exactly backwards. Agents, he argues on Podcast Alpha, are about to become the largest users of enterprise software, generating work at roughly 100 times the scale of human employees, and the gains will land in the software stack rather than in the model labs above it.
This is more than a CEO reframe. It is a structural inversion of who the customer actually is. When the customer is a language-model-driven agent that reviews contracts, processes claims, or compiles compliance reports, the unit economics of software change. A salesperson logs into Salesforce and creates a few dozen records a day. An agent can create a few thousand. The infrastructure built for humans still works for agents, and it scales linearly with token volume (the count of text fragments a model processes) in a way that per-seat licensing never did.
The split that matters
Most of the public conversation about agents has been about coding: Cursor, Devin, GitHub Copilot, Claude Code. Levie thinks that is the wrong slice. Coding, he estimates on the same episode, will be only 5 to 10 percent of agent-driven work. The other 90 to 95 percent is the long tail of knowledge work that has resisted automation for decades: contract review at law firms, claims processing at insurers, account reconciliation at finance teams, patient intake at hospitals. The shared substrate is unstructured data, the PDFs, emails, scans, spreadsheets, and recordings that do not sit neatly in a database row.
That is the file-system problem. Agents cannot operate on a CRM (customer relationship management platform, Salesforce's core product category) the way humans do, because most of what an enterprise actually knows lives outside structured databases. They need a content layer that can read, write, and route unstructured documents the way a human file clerk would. Box has been building that layer for two decades, and Levie's pitch on an earlier Podcast Alpha installment is that agents are now arriving as the third constituent of his platform, alongside humans and applications.
Three-layer stack, three margin paths
Levie organizes the AI economy into three layers moving on independent timelines. The bottom layer is chips and infrastructure: Nvidia, the hyperscalers, and the power grid behind them. The middle layer is the software stack: Box, Atlassian, Salesforce, ServiceNow, Workday. The top layer is the model labs: OpenAI, Anthropic, Google. As model costs fall, he expects margin to migrate upward from chips and downward from models, settling in the applied software layer where companies own the workflow, the data, and the customer relationship.
This is the contrarian move. The dominant media frame treats AI as a model race, with value accruing to the labs and the hyperscalers. Levie's frame inverts that: value accrues to the layer that does the actual work, and the work in most enterprises is still being done by software that was built long before large language models existed. His own numbers line up with it. Box's Q1 fiscal 2027 earnings call, reported in late May 2026, describes revenue growth accelerating rather than decelerating through the period when agent adoption among enterprise customers became measurable. The Q4 fiscal 2026 transcript from March 2026 shows the same direction of travel, with management pointing to AI-driven workloads as a tailwind. Levie reads Atlassian's recent earnings the same way, pointing to a cohort of customers that appears to be the most agent-aggressive as the fastest-growing part of the platform, though that reading has not been checked here against the primary Atlassian earnings transcript.
Why one model is not enough
The second leg of the argument is structural. Levie expects enterprises to refuse single-vendor model lock-in, and he gives three reasons. First, cost volatility: Levie, citing Coinbase CEO Brian Armstrong's math, estimates a gap of roughly 100x between the cost of a frontier model (the most capable and most expensive AI in a given generation) and that of a near-frontier open-source model on comparable tasks. Even at a 10x gap, the routing incentive is severe. Second, performance differences: different models are tuned for different tasks, and the spread between a strong and a weak model on a specific workflow is wide enough to matter. Third, geopolitical access risk: governments are starting to restrict access to specific models, which means that concentrating on any one provider also concentrates regulatory risk.
The conclusion is that model routing, switching between AI models on a per-task basis to balance cost, performance, and compliance, becomes a hard requirement rather than an optimization. That requirement is itself a feature of the applied software layer. The model labs do not get to bundle it, because bundling is what the customer is trying to avoid.
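The routing requirement described above can be sketched in a few lines. This is a minimal illustration, not anything Levie or Box describes: the model names, prices, quality scores, and region sets are all hypothetical, and the 100x price gap is chosen only to mirror the estimate quoted earlier. The rule is simply "cheapest model that clears the quality bar and is permitted in the region."

```python
from dataclasses import dataclass


@dataclass
class Model:
    name: str                      # hypothetical label, not a real product
    usd_per_million_tokens: float  # hypothetical price
    quality: dict                  # task -> score in [0, 1], illustrative only
    regions: frozenset             # regions where use is assumed to be permitted


def route(task: str, region: str, min_quality: float, models: list) -> Model:
    """Pick the cheapest model that clears the quality bar and is allowed in the region."""
    eligible = [
        m for m in models
        if region in m.regions and m.quality.get(task, 0.0) >= min_quality
    ]
    if not eligible:
        raise LookupError(f"no model satisfies task={task!r} region={region!r}")
    return min(eligible, key=lambda m: m.usd_per_million_tokens)


# Made-up catalogue with a 100x price gap between the two entries.
MODELS = [
    Model("frontier-a", 10.0,
          {"contract_review": 0.95, "claims_triage": 0.93},
          frozenset({"us", "eu"})),
    Model("open-b", 0.10,
          {"contract_review": 0.80, "claims_triage": 0.91},
          frozenset({"us", "eu", "apac"})),
]

choice = route("claims_triage", "us", 0.9, MODELS)
```

With these made-up numbers, a claims-triage request routes to the cheaper model, while contract review at the same 0.9 quality bar routes to the frontier model, and a request from a region where no eligible model is permitted fails outright. All three constraints from the argument (cost, performance, access) show up as filters in one function, which is why the routing decision sits naturally with whoever owns the workflow.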
Friction worth naming
Levie's argument has friction, and he does not pretend otherwise. Uber has been cutting token usage, which reads at first glance like a reversal of agent demand. Levie frames this as budget math rather than a demand reversal: when a model costs 100 times as much as a competitor, you optimize usage until the cost gap closes, then expand again. He invokes the idea that cheaper software pulls in more demand rather than less, the economic pattern usually known as the Jevons paradox and traced to the 19th-century coal debate, and points to continued aggressive hiring at Anthropic and OpenAI as supporting evidence.
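The budget math is easy to make concrete. The sketch below assumes a fixed budget and two invented per-token prices that differ by 100x; none of these figures come from Uber, Box, or Levie.

```python
def tokens_affordable(budget_usd: float, usd_per_million_tokens: float) -> float:
    """Number of tokens a fixed budget buys at a given price per million tokens."""
    return budget_usd / usd_per_million_tokens * 1_000_000


BUDGET_USD = 1_000.0    # hypothetical fixed budget
FRONTIER_PRICE = 10.0   # hypothetical USD per million tokens
CHEAPER_PRICE = 0.10    # hypothetical, 100x cheaper

frontier_tokens = tokens_affordable(BUDGET_USD, FRONTIER_PRICE)
cheaper_tokens = tokens_affordable(BUDGET_USD, CHEAPER_PRICE)

# Same budget, 100x the token volume once the price gap is taken.
ratio = cheaper_tokens / frontier_tokens
print(f"{frontier_tokens:,.0f} vs {cheaper_tokens:,.0f} tokens ({ratio:.0f}x)")
```

Under these assumptions, a team that cuts usage on the expensive model is not necessarily buying less work. The same spend buys about 100 times the volume once the cheaper option is good enough for the task.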
The harder friction is that the 100x figure and the 90-95 percent knowledge-work split are Levie's own estimates, from a CEO who stands to benefit if his framing wins. The numbers are reasonable as forecasts, but they are forecasts rather than measurements. Box's earnings transcripts show the direction of travel but not yet the magnitude. The thesis is falsifiable: if SaaS revenue across the applied layer starts contracting while agent-driven usage rises, the inversion weakens.
What to watch
Three signals will tell us whether Levie's frame holds. First, whether the major enterprise software vendors (Salesforce, ServiceNow, Workday, Atlassian) start disclosing agent-driven workload as a separate line item in their earnings. Second, whether per-token pricing compression continues long enough that agent workflows become the default for the knowledge-work tasks he names. Third, whether the labs build horizontal workflow layers, in which case the applied-software argument starts to fray, or whether they stay vertically focused on model quality and let the routing layer live elsewhere.
The question is no longer whether agents will use enterprise software. That is already visible in the earnings transcripts. The question is which layer of the AI economy will own the margin when they do.