The Agentic Token Tax

Goldman Sachs forecasts a compute explosion

The silent hum of the data center is getting louder. Goldman Sachs Research released a report on May 24 suggesting that token consumption will increase 24-fold by 2030. This is not a linear extension of current habits. It is a fundamental shift in how compute is consumed. We are moving away from human-to-machine prompts and into the era of machine-to-machine orchestration. The implications for capital expenditure are staggering. The energy grid is not ready. The semiconductor supply chain is already stretched thin. A 24-fold rise, an increase of roughly 2,300 percent, represents a total reimagining of the digital economy.
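
To put the 24-fold figure in yearly terms, here is a minimal sketch of the implied compound growth rate. The baseline year is not stated above, so the two horizons tried here (four and six years) are assumptions, and `implied_annual_growth` is a hypothetical helper, not anything from the Goldman report.

```python
def implied_annual_growth(multiple: float, years: int) -> float:
    """Constant yearly growth rate that compounds to `multiple` over `years`."""
    if multiple <= 0 or years <= 0:
        raise ValueError("multiple and years must be positive")
    return multiple ** (1.0 / years) - 1.0


# Assumption: the forecast window is either 4 or 6 years long.
for years in (4, 6):
    rate = implied_annual_growth(24, years)
    print(f"{years}-year horizon: about {rate:.0%} growth per year")
```

Under either assumption, consumption would have to grow by well over half every single year, which is why the forecast reads as a break from current habits rather than an extension of them.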

The mechanics of autonomous waste

Agentic AI works differently from a standard chatbot built on a large language model. A chatbot waits for a user. An agent acts on a goal. This requires iterative loops. An agent might call an API, check the result, realize it failed, and try a different path. This act-observe-retry cycle is usually called an agent loop, and it is often paired with Chain of Thought (CoT) reasoning, in which the model writes out intermediate steps before acting. Each step in that loop consumes tokens. According to recent Bloomberg market analysis, the cost of compute for these autonomous workflows is currently five times higher than for simple generative tasks. The efficiency gains must be massive to justify the burn. If an agent takes 50 steps to book a flight, and each step costs about as much as a single search query, it has consumed 50 times the tokens. The real multiple can be higher, because each step typically re-sends the context accumulated so far. This is the ‘Agentic Tax’ that enterprises are beginning to calculate.
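
The 50-step arithmetic can be sketched as a simple token counter. This is an illustration under stated assumptions, not a measurement: the 200-token prompt and 100-token step output are made-up numbers, and `agent_token_cost` is a hypothetical helper. It compares a loop where every step costs the same as one query with a loop where each step re-sends everything produced so far.

```python
def agent_token_cost(steps: int, prompt_tokens: int,
                     output_tokens_per_step: int, carry_context: bool) -> int:
    """Total tokens (input plus output) billed across an agent loop."""
    total = 0
    context = prompt_tokens  # tokens sent as input on the next step
    for _ in range(steps):
        total += context + output_tokens_per_step
        if carry_context:
            # Assumption: each step's output is appended to the next input.
            context += output_tokens_per_step
    return total


# Hypothetical sizes: 200-token prompt, 100 tokens of output per step.
single_query = agent_token_cost(1, 200, 100, carry_context=False)
flat = agent_token_cost(50, 200, 100, carry_context=False)
growing = agent_token_cost(50, 200, 100, carry_context=True)

print(f"fixed-size steps: {flat / single_query:.0f}x a single query")
print(f"growing context:  {growing / single_query:.0f}x a single query")
```

With fixed-size steps the multiple equals the step count. Once context accumulates, cost grows roughly with the square of the step count, so the tax on long-running agents is heavier than a per-step estimate suggests.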

[Chart: Projected Global Token Consumption Growth (Index 2024=100)]

The physical limits of digital growth

Silicon is the new oil. The demand for H200 and Blackwell-class chips is no longer only about training. It is increasingly about inference. Goldman’s data points to a world where AI agents are running 24/7 in the background. They are managing supply chains. They are optimizing ad spend. They are writing code while the developers sleep. This creates a baseline load on data centers that mimics industrial manufacturing. Reuters reporting on energy infrastructure reflects the shift, with utility companies in northern Virginia and Dublin struggling to keep pace with the power draw of these agentic clusters. Unless the energy needed per token falls sharply, the 24-fold increase in tokens translates into a massive increase in gigawatts.
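
The link between tokens and gigawatts is a unit conversion: average power equals tokens per year times energy per token, divided by the seconds in a year. The sketch below assumes a single average energy cost per token; `joules_per_token` and every number fed into it are hypothetical placeholders, not figures from Goldman or Reuters.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def average_power_gw(tokens_per_year: float, joules_per_token: float) -> float:
    """Average continuous power draw, in gigawatts, for a yearly token volume."""
    watts = tokens_per_year * joules_per_token / SECONDS_PER_YEAR
    return watts / 1e9


# Hypothetical baseline; only the ratios between scenarios are meaningful.
base_tokens = 1e15
base_joules = 1.0

today = average_power_gw(base_tokens, base_joules)
same_efficiency = average_power_gw(24 * base_tokens, base_joules)
halved_energy = average_power_gw(24 * base_tokens, base_joules / 2)

print(f"24x tokens, same energy per token: {same_efficiency / today:.0f}x power")
print(f"24x tokens, half the energy per token: {halved_energy / today:.0f}x power")
```

Power scales one-for-one with token volume only if energy per token stays fixed. Any hardware or model efficiency gain divides the result, which is the variable to watch in utility forecasts.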

Token density vs economic value

Not all tokens are created equal. The market is currently obsessed with volume. Smart money is looking at density. We are seeing a divergence between high-value reasoning tokens and low-value ‘slop’ tokens. The following table illustrates the projected shift in token utilization across different sectors as of May 2026.

| Sector | Current Token Usage (Rel.) | Projected 2030 Usage (Rel.) | Primary Driver |
| --- | --- | --- | --- |
| Financial Services | 1.0x | 18.5x | Automated Compliance Agents |
| Healthcare | 1.0x | 32.0x | Autonomous Diagnostic Loops |
| Software Dev | 1.0x | 28.2x | Self-healing Codebases |
| Consumer Retail | 1.0x | 12.4x | Personal Shopping Concierges |

The spread is wide. Healthcare leads the pack at 32.0x because the ‘reasoning’ required for medical diagnostics involves massive cross-referencing of patient data, research papers, and real-time vitals. Each diagnostic check is a multi-thousand-token event. Goldman Sachs is betting that the enterprise will absorb these costs because the alternative is human labor that is slower and more prone to error. However, the margin compression for SaaS companies is real. They pay for compute that scales with usage while charging flat subscription fees. This model is broken. We expect a shift toward ‘pay-per-reasoning-step’ billing cycles by the end of the year.
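
The margin squeeze can be illustrated with a toy comparison of flat-fee and per-step billing. Every price, fee, and usage level below is an invented assumption for illustration, and the function names are hypothetical. The point is only the shape of the result: under a flat fee, margin falls as usage rises, while under per-step pricing it rises.

```python
def compute_cost(steps: int, tokens_per_step: int, cost_per_1k_tokens: float) -> float:
    """Vendor's compute bill for one customer in one month."""
    return steps * tokens_per_step / 1000 * cost_per_1k_tokens


def flat_fee_margin(fee: float, steps: int, tokens_per_step: int,
                    cost_per_1k_tokens: float) -> float:
    """Monthly margin when the customer pays a fixed subscription."""
    return fee - compute_cost(steps, tokens_per_step, cost_per_1k_tokens)


def per_step_margin(price_per_step: float, steps: int, tokens_per_step: int,
                    cost_per_1k_tokens: float) -> float:
    """Monthly margin when the customer pays per reasoning step."""
    return price_per_step * steps - compute_cost(steps, tokens_per_step, cost_per_1k_tokens)


# Assumed numbers: 2,000 tokens per step, $0.01 per 1k tokens,
# a $30 flat fee versus $0.03 per step.
for label, steps in (("light user", 500), ("heavy user", 5000)):
    flat = flat_fee_margin(30.0, steps, 2000, 0.01)
    metered = per_step_margin(0.03, steps, 2000, 0.01)
    print(f"{label}: flat-fee margin ${flat:.2f}, per-step margin ${metered:.2f}")
```

In this toy setup the heavy user turns the flat-fee account into a loss while remaining profitable under per-step pricing, which is the incentive behind usage-based billing.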

The hardware bottleneck persists

Foundries cannot build fast enough. The 24-fold growth forecast assumes that the current pace of GPU deployment continues unabated. It ignores geopolitical friction. It ignores the scarcity of high-bandwidth memory (HBM). If the supply of specialized AI chips hits a plateau, the token explosion will be throttled. This would lead to a massive spike in token pricing. We are already seeing secondary markets for compute time emerging in the dark pools of the tech world. Large enterprises are hoarding compute capacity like a strategic reserve. They know that without tokens, their autonomous future is paralyzed.
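
One way to reason about the price spike is a textbook constant-elasticity demand curve: if supply is capped below what buyers want at today's price, the price rises until demand shrinks to fit. This is a toy model chosen for illustration, not a method from the Goldman report, and the elasticity value and quantities below are assumptions.

```python
def clearing_price(base_price: float, demand_at_base_price: float,
                   supply_cap: float, elasticity: float) -> float:
    """Price at which constant-elasticity demand equals the supply cap.

    Demand is modeled as Q(P) = D0 * (P / P0) ** (-elasticity), so solving
    Q(P) = supply_cap gives P = P0 * (D0 / supply_cap) ** (1 / elasticity).
    """
    if base_price <= 0 or supply_cap <= 0 or elasticity <= 0:
        raise ValueError("base_price, supply_cap and elasticity must be positive")
    if demand_at_base_price <= supply_cap:
        return base_price  # no shortage, so no upward pressure in this model
    return base_price * (demand_at_base_price / supply_cap) ** (1.0 / elasticity)


# Assumed scenario: demand reaches 24 units at today's price,
# but chips can only serve 12 units.
for elasticity in (0.5, 1.0, 2.0):
    price = clearing_price(1.0, 24.0, 12.0, elasticity)
    print(f"elasticity {elasticity}: price multiple {price:.2f}x")
```

The less willing buyers are to cut usage (low elasticity), the sharper the spike for the same shortfall, which is consistent with enterprises hoarding capacity ahead of time.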

The next data point to watch is the June 15th quarterly earnings from the major cloud providers. We need to see if their capital expenditure guidance matches the 24-fold trajectory predicted by Goldman Sachs. If the spending cools, the agentic dream cools with it. If the spending accelerates, the energy crisis becomes the primary narrative of the late 2020s.
