The conversation about whether AI is too big to run keeps fixating on the wrong number. Everyone watches the price of a GPU-hour, which keeps falling, and concludes the only ceiling is how much compute you can afford to rent. That was the 2024 question. It has quietly been replaced by a harder one, and the harder one is not about chips at all. It is about whether the building you put the chips in can be plugged into anything.
Start with the demand, because the demand is genuinely staggering and worth stating plainly. The IEA puts total data-center electricity consumption above 1,000 terawatt-hours in 2026. That is roughly the annual electricity use of Japan, the world's fourth-largest economy, now spent keeping racks warm. Global data-center electricity use grew about 17% in 2025, and the AI-specific slice grew far faster than that. The IEA projects that data-center demand will more than double by 2030, with AI as the dominant driver: its share of data-center power is projected to rise from somewhere in the 5–15% range toward 35–50%.
Those are the figures everyone cites. Here is the one almost no one does.
The queue is the product now
In the United States, somewhere north of 2,300 gigawatts of generation and storage capacity is sitting in interconnection queues, waiting for permission to connect to the grid. That is more than the entire installed US power fleet. The median project now waits close to five years, and roughly four in five eventually withdraw, according to Lawrence Berkeley National Laboratory, which tracks the queues. In the markets where the data centers actually want to be (Northern Virginia, Phoenix, Dallas), utilities are quoting interconnection waits of four to seven years. AI infrastructure that came online in 2025 had been in the pipeline, on average, for more than seven years.
You can rent a thousand GPUs this afternoon. You cannot rent the megawatt that powers them; you can only join a line for it, and the line is five years long.
So the asymmetry is total. The compute layer clears in minutes — spin up the instance, pay the bill, tear it down. The power layer underneath it clears in years, if it clears at all. US data-center power demand is projected to hit about 76 gigawatts in 2026, up from 50 in 2024, and the grid operators are openly forecasting peak loads they do not currently know how to serve: PJM, the largest US grid market, projects summer peak demand climbing from roughly 154 GW toward 210 GW over the next decade, driven mostly by data centers. The chips are not the bottleneck. The substation is.
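To put the two demand figures above on the same footing, it helps to convert each into an implied annual growth rate. The sketch below is back-of-the-envelope arithmetic, not a forecast. The function name is hypothetical, the inputs are the figures quoted above, and the 10-year span for PJM is an assumption standing in for "over the next decade."

```python
def implied_annual_growth(start_gw: float, end_gw: float, years: float) -> float:
    """Compound annual growth rate that takes start_gw to end_gw over `years`."""
    return (end_gw / start_gw) ** (1.0 / years) - 1.0


# US data-center demand: 50 GW (2024) to about 76 GW (2026), as quoted above.
us_data_centers = implied_annual_growth(50, 76, years=2)

# PJM summer peak: roughly 154 GW toward 210 GW.
# ASSUMPTION: "the next decade" is treated as exactly 10 years.
pjm_summer_peak = implied_annual_growth(154, 210, years=10)

print(f"US data-center demand: {us_data_centers:.1%} per year")
print(f"PJM summer peak:       {pjm_summer_peak:.1%} per year")
```

Under those assumptions the data-center figure works out to roughly 23% a year, and the PJM peak to about 3% a year. The load is compounding several times faster than the system it has to plug into.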
Why this lands hardest on agents
A chatbot is a bursty load. Someone types, the GPU lights up for two seconds, it idles. You can serve a great deal of that demand from spare capacity, shifted around in time, smoothed across a day.
An agent is not bursty. An agent is the thing this publication keeps writing about — the operator running a loop at 4am, the scheduled job that wakes every ten minutes, the autonomous process that does not stop when the human goes to bed. Agents convert AI from a load you summon into a load that simply runs. That is a baseload appliance, and baseload is precisely what the grid cannot give you on short notice, because baseload is what the five-year queue is rationing.
Which produces the genuinely strange mismatch at the center of all this. The model your agent runs on has a useful commercial life measured in weeks before a point-release makes it the slow, expensive option — we watched four frontier models ship in a single month this spring. The power contract to run that agent at scale is measured in years before a utility will energize it. You are signing a multi-year lease on electricity to run software that will be obsolete before the transformer arrives. No one prices an asset that way on purpose. The industry is doing it by accident, because the two clocks were never meant to be read together.
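The two clocks can be read together with one line of arithmetic. The sketch below counts how many model lifetimes fit inside an interconnection wait. The function name is hypothetical, and the 8-week model life is an illustrative assumption, since "weeks" above is not a measured figure. The waits of four, five, and seven years are the ones quoted earlier.

```python
WEEKS_PER_YEAR = 52


def model_generations_during_wait(wait_years: float, model_life_weeks: float) -> float:
    """How many successive model lifetimes fit inside one interconnection wait."""
    return wait_years * WEEKS_PER_YEAR / model_life_weeks


# ASSUMPTION: a model stays commercially competitive for about 8 weeks.
# This number is illustrative only; change it to test your own estimate.
ASSUMED_MODEL_LIFE_WEEKS = 8

for wait_years in (4, 5, 7):
    generations = model_generations_during_wait(wait_years, ASSUMED_MODEL_LIFE_WEEKS)
    print(f"{wait_years}-year wait: about {generations:.0f} model generations")
```

On that assumption, a five-year wait spans a few dozen model generations. The exact count depends entirely on the assumed model life, but the order of magnitude is the point.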
What the number actually means
None of this says AI stops. It says the location of the constraint moved, and most of the strategy hasn't caught up to where it went. The companies that win the next phase will not be the ones with the cleverest model — models are converging and depreciating. They will be the ones who locked power years ago: behind-the-meter generation, gas turbines on site, the nuclear restarts and the data-center-next-to-the-plant deals that looked eccentric in 2024 and look like foresight now.
For everyone else, the binding question stops being "how good is the model?" and becomes "can you get the electrons, and when?" The honest forecast for the agent era is not in any benchmark. It is in an interconnection queue, and the queue says: not for a while.



