The arithmetic is blunt. U.S. data centers consumed roughly 224 terawatt-hours of electricity in 2025, more than 5% of the country's total, according to the International Energy Agency. That is up from an estimated 1.9% in 2018. In seven years, the slice of the U.S. power grid feeding the machines that answer ChatGPT queries, generate images, and train the next generation of models has nearly tripled. Growth, the IEA says, is set to soar, and the demand side shows why: OpenAI disclosed in March 2026 that more than 900 million people use ChatGPT every week, and an earlier TechCrunch estimate put usage at roughly 2.5 billion prompts per day.
Each individual query is small. Google estimates a median Gemini text prompt consumes about 0.24 watt-hours of electricity, roughly the energy a 60-watt bulb draws in 14 seconds, or about 9 seconds of a modern flat-screen TV. Google says that is 33 times more efficient than its 2024 baseline. Multiply a per-query figure of that size by billions of prompts a day and the scale is no longer abstract.
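The comparisons above are simple unit conversions, and they can be checked in a few lines. The sketch below is illustrative only: it takes the 0.24 watt-hour figure Google reports for Gemini and, as an assumption, applies it to the roughly 2.5 billion daily prompts estimated for ChatGPT, a different product whose per-prompt energy is not public. The function and variable names are invented for this example.

```python
# Back-of-envelope check of the per-prompt comparisons.
# Assumption: every prompt costs the 0.24 Wh Google reports for a median
# Gemini text prompt. ChatGPT's real per-prompt energy is not public.

JOULES_PER_WH = 3600.0          # 1 watt-hour = 3,600 joules
PROMPT_WH = 0.24                # Google's median Gemini text prompt
PROMPTS_PER_DAY = 2.5e9         # TechCrunch estimate cited in the article


def seconds_at_power(energy_wh: float, watts: float) -> float:
    """How long a device drawing `watts` runs on `energy_wh` of energy."""
    return energy_wh * JOULES_PER_WH / watts


# A 60-watt bulb running on one prompt's worth of energy.
bulb_seconds = seconds_at_power(PROMPT_WH, 60.0)

# The power draw implied by "about 9 seconds of TV".
implied_tv_watts = PROMPT_WH * JOULES_PER_WH / 9.0

# Scale up: watt-hours -> megawatt-hours per day -> terawatt-hours per year.
daily_mwh = PROMPT_WH * PROMPTS_PER_DAY / 1e6
annual_twh = daily_mwh * 365 / 1e6

print(f"60 W bulb: {bulb_seconds:.1f} s per prompt")
print(f"Implied TV draw: {implied_tv_watts:.0f} W")
print(f"Text prompts: {daily_mwh:.0f} MWh/day, {annual_twh:.2f} TWh/year")
```

Under those assumptions, text prompts alone would draw about 600 megawatt-hours a day. The figure covers text prompts only and leaves out image and video generation, training, and idle capacity.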
The structural problem is older than AI. Efficiency gains historically trigger what economists call the Jevons effect: cheaper coal kept getting used in more factories, not fewer. Cheaper fuel kept getting burned in bigger engines. The same pattern is already visible in AI's energy curve. As Katarina Zimmer reports in Knowable Magazine (republished by Singularity Hub), training a model like GPT-4 consumed an estimated 50 to 60 gigawatt-hours, enough to power San Francisco for three to four days. The bigger cost is everything that comes after. "You train once, then you infer for a billion people," University of Michigan computer scientist Mosharaf Chowdhury told Knowable. The ml.energy leaderboard he runs tracks exactly this.
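Chowdhury's point can be put in rough numbers. The sketch below is a back-of-envelope estimate, not a measurement: it assumes, purely for illustration, that every one of the roughly 2.5 billion daily prompts costs the 0.24 watt-hours Google reports for a median Gemini text prompt, and asks how many days of inference would match the 50 to 60 gigawatt-hour training estimate. The function name is hypothetical.

```python
# How many days of inference equal one training run?
# Assumptions (mixed sources, illustration only):
#   - training: 50 to 60 GWh, the GPT-4 estimate cited above
#   - inference: 0.24 Wh per prompt (Google's Gemini figure)
#   - volume: 2.5 billion prompts per day (TechCrunch's ChatGPT estimate)


def breakeven_days(training_gwh: float, wh_per_prompt: float,
                   prompts_per_day: float) -> float:
    """Days of inference needed to match a one-time training cost."""
    daily_inference_gwh = wh_per_prompt * prompts_per_day / 1e9  # Wh -> GWh
    return training_gwh / daily_inference_gwh


low_days = breakeven_days(50.0, 0.24, 2.5e9)
high_days = breakeven_days(60.0, 0.24, 2.5e9)

print(f"Inference matches training after {low_days:.0f} to {high_days:.0f} days")
```

On these assumptions, inference would overtake the one-time training cost in roughly three months, and it keeps accumulating for as long as the model stays in service.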
The room for cuts is real, but it is uneven. Researchers, vendors, and the press tend to group the proposals into four pillars: algorithms, hardware, computing methods, and siting or green energy. The question is which ones have actual leverage, meaning the scale, timeline, and political-economic alignment to bend the curve, and which are being oversold.
Algorithms are the cleanest win on paper. A UNESCO report and a study by University College London's Ivana Drobnjak compared Meta's Llama 3.1 against small task-specialized models like DistilBART for summarization or t5-small-XSum. The small models used more than 90% less energy to do the same job. Researchers at Johannes Kepler University Linz, including Günter Klambauer, have shown that an extended LSTM architecture called xLSTM uses about 50% less energy than transformer models on texts around 8,000 words long. The catch is that these gains are measured on specific tasks, and the larger the model, the more demand there is to use it. The same dynamic is visible in DeepSeek's R1, which was marketed as a much lower-energy competitor to U.S. frontier models. MIT Technology Review reported in January 2025 that independent experts have raised doubts about the energy figures, in part because the model routes queries to the most expensive parts of itself more often than its marketing suggested.
Hardware is the lever with the most vendor control and the most inertia. Custom accelerators from Google, Amazon, Meta, and others have driven steady per-query efficiency gains, and the Hugging Face AI Energy Score leaderboard shows that for the same model, different hardware configurations can cut energy use by factors of three to ten. But new AI-focused hyperscale campuses are typically a gigawatt or more, roughly a tenth of Los Angeles' electrical capacity, according to capex announcements from Google, Meta, Amazon, OpenAI, Anthropic, Microsoft, and Oracle reported by Reuters and The Wall Street Journal in late 2025. The chips are getting more efficient. The campuses are getting bigger. Net effect so far: more.
Computing methods, the way work is split, routed, and reused, are the lever with the most leverage per dollar and the least public visibility. Inference, not training, is the dominant cost for any deployed model, as Chowdhury's quote captures. Mixture-of-experts architectures, used in Gemini and reportedly in the models behind ChatGPT, activate only a subset of parameters per query rather than running the whole model. Sparse-attention techniques reduce the quadratic compute cost of attention over long contexts in transformers, the architecture first described in Vaswani et al.'s 2017 paper. Cornell Tech's Udit Gupta and Cornell's Fengqi You have documented gains from caching, batching, and routing. None of this is reported publicly at the data-center level, which is the central transparency problem.
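A toy example makes the mixture-of-experts idea concrete. The sketch below is not how Gemini, ChatGPT, or any production system is implemented; it is a minimal top-k router, with made-up gate scores and stand-in "experts" that are simple scaling functions, showing that only the selected experts run for a given input. All names and numbers are assumptions for illustration.

```python
import math

# Toy mixture-of-experts routing: pick the k best-scoring experts,
# run only those, and blend their outputs. Everything here is hypothetical.

NUM_EXPERTS = 8
TOP_K = 2


def softmax(scores):
    """Normalize scores into weights that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_route(gate_scores, k):
    """Return (expert_index, weight) pairs for the k highest gate scores."""
    ranked = sorted(range(len(gate_scores)),
                    key=lambda i: gate_scores[i], reverse=True)
    chosen = ranked[:k]
    weights = softmax([gate_scores[i] for i in chosen])
    return list(zip(chosen, weights))


def moe_forward(x, experts, gate_scores, k):
    """Run only the routed experts and return (blended output, experts used)."""
    routes = top_k_route(gate_scores, k)
    blended = sum(weight * experts[i](x) for i, weight in routes)
    return blended, [i for i, _ in routes]


# Stand-in experts: expert i multiplies its input by (i + 1).
experts = [lambda x, scale=i + 1: scale * x for i in range(NUM_EXPERTS)]

# Made-up gate scores; a real router computes these from the input.
gate_scores = [0.1, 2.0, -1.0, 1.5, 0.0, -0.5, 0.3, 1.0]

output, used = moe_forward(3.0, experts, gate_scores, TOP_K)
active_fraction = TOP_K / NUM_EXPERTS

print(f"Experts used: {used} ({active_fraction:.0%} of the total)")
print(f"Blended output: {output:.3f}")
```

In this toy setup two of eight experts do any work for the query. A real model's savings depend on how the parameters are divided among experts and on routing overhead, which the sketch does not attempt to model.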
Siting and green energy are the lever with the largest absolute potential and the most political friction. New AI data-center campuses are landing in places where the grid is already stressed, and gas-powered plants have filled much of the new demand, as UC Santa Barbara's Eric Masanet has documented and as Zimmer's reporting confirms. Power purchase agreements, the contracts tech companies sign to claim clean energy for their load, do not by themselves cut net emissions unless new clean generation exceeds new demand. It rarely does. A 2025 Nature study projected that absent efficiency gains, U.S. data centers could release 24 to 44 megatons of carbon dioxide equivalent annually, the upper end comparable to Norway's yearly emissions. The community costs sit on top of the carbon: air and noise pollution, water stress for cooling, and the lifecycle emissions of building the hardware itself.
The transparency gap is the reason none of this can be ranked with confidence. Tech companies generally do not publicly report data-center electricity use. The IEA's 224-terawatt-hour figure is an estimate built from public filings, market reports, and inference about undisclosed load. Hugging Face's leaderboard and the ml.energy project are exceptions built by outside researchers; they are not the rule. Mandatory reporting would not by itself bend the curve, but without it the curve cannot be measured, and what cannot be measured cannot be regulated.
So which lever is being pulled hardest, and which is being left unpulled because it does not fit any single company's business model? Algorithms and hardware are getting pulled hard: the work is concrete, the gains compound, and the press releases write themselves. Computing methods are getting pulled where vendors see a competitive edge, and buried where they do not. Siting and grid integration, the lever with the largest scale and the longest timeline, is being left largely to local utilities and to community groups organizing against new gas plants and new water draws. That mismatch, between where the leverage is and where the action is, is the story the 5% figure is really telling.