AI Economics

AI Token Spend Is Becoming the New Cloud Bill

Token consumption is scaling like cloud infra did a few years ago — and missing context is a major reason why. The next phase of enterprise AI will be defined by how efficiently companies build, govern, and serve context to their models.

Cubyts · May 18, 2026 · 5 min read

I was recently speaking with someone from a major cloud provider.

He mentioned that one enterprise customer was sitting on an unpaid AI bill approaching a quarter million dollars because the spend came as a surprise internally.

The root cause was rapid adoption of AI tools across teams, most likely amplified by repeated prompting, retries, and fragmented workflows.

"AI token consumption is starting to resemble cloud infra spend from a few years ago," he said. And it is scaling much faster than most enterprises anticipated.

Gergely Orosz recently wrote an excellent piece in The Pragmatic Engineer titled *"The Pulse: token spend breaks budgets – what next?"* discussing how AI token consumption is exploding across engineering organizations. Based on conversations across multiple companies, some organizations reportedly saw token usage grow nearly 10x within six months.
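
To put that growth figure in perspective, here is a rough back-of-the-envelope calculation. It assumes usage compounds at a steady monthly rate, which is a simplification rather than anything the reported figure states; the variable names are illustrative.

```python
# Implied monthly growth if token usage grows 10x over six months.
# Assumption: growth compounds evenly each month (real usage is likely lumpier).
growth_factor = 10.0   # the reported "nearly 10x"
months = 6

monthly_multiplier = growth_factor ** (1 / months)
monthly_growth_pct = (monthly_multiplier - 1) * 100

print(f"{monthly_multiplier:.2f}x per month (~{monthly_growth_pct:.0f}% growth)")
# prints: 1.47x per month (~47% growth)
```

Under that assumption, a budget set at the start of a quarter would be off by roughly 3x by the end of it.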

With AI adoption well underway, the question now is: how do we control AI economics (tokenomics) while still improving output quality?

The numbers are already loud

Salesforce CEO Marc Benioff recently stated that Salesforce expects to spend close to $300 million on Anthropic tokens in 2026.
Anthropic has reportedly estimated enterprise coding agent usage at $150–$250 per developer per month in some environments.
Research on agentic coding systems has shown that some workflows can consume up to 1000x more tokens than standard code assistance tasks — while higher token usage does not necessarily improve outcomes.

That last point is important. Because token spend is not only driven by AI adoption — it is increasingly driven by missing context.

Missing context is the hidden cost driver

An AI coding assistant without architectural awareness asks more questions.
A model without dependency understanding generates more retries.
An agent without org memory repeatedly rediscovers decisions that already exist somewhere inside the company.

What should have taken 2 prompts becomes 10.

When this compounds across hundreds or thousands of engineers, token consumption starts behaving like infrastructure spend at enterprise scale.
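
A minimal sketch of how the 2-versus-10 prompt gap could add up across an organization. Every number below (headcount, tasks per day, tokens per prompt, price per million tokens, workdays) is a made-up assumption for illustration, and `monthly_token_cost` is a hypothetical helper, not a real pricing API.

```python
def monthly_token_cost(engineers, tasks_per_day, prompts_per_task,
                       tokens_per_prompt, usd_per_million_tokens, workdays=21):
    """Rough monthly spend in USD. All inputs are illustrative assumptions."""
    tokens = (engineers * tasks_per_day * prompts_per_task
              * tokens_per_prompt * workdays)
    return tokens / 1_000_000 * usd_per_million_tokens

# Hypothetical org: 1,000 engineers, 20 AI-assisted tasks each per day,
# 8,000 tokens per prompt, a blended price of $5 per million tokens.
with_context = monthly_token_cost(1000, 20, 2, 8000, 5.0)      # 2 prompts per task
without_context = monthly_token_cost(1000, 20, 10, 8000, 5.0)  # 10 prompts per task

print(f"with context:    ${with_context:,.0f} per month")
print(f"without context: ${without_context:,.0f} per month")
print(f"multiplier:      {without_context / with_context:.0f}x")
```

With these assumed inputs the bill goes from $33,600 to $168,000 a month. The model is deliberately linear; in practice each retry often resends a growing conversation history, so the real multiplier could be higher.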

The industry is already moving toward context

Google's Gemini Enterprise is one such example. Google is positioning Gemini not just as a model, but as an enterprise assistant connected to organizational systems like Gmail, Docs, Drive, Sheets, Meet, and enterprise search. The idea is simple: if AI can access enterprise context directly, users spend less time repeatedly explaining information through prompts.

That direction makes a lot of sense. But it also raises another important question: will horizontal enterprise context alone be enough?

Every industry operates differently. A banking workflow carries different dependencies, compliance requirements, terminology, and operational expectations compared to healthcare, telecom, manufacturing, consulting, or software delivery.

Even inside the same enterprise, every team develops its own operational language, architectural standards, historical decisions, delivery patterns, and workflows over time.

Context is a living system

Not just connected documents. Not just enterprise search. But curated, evolving, domain-aware understanding that continuously adapts to organizational and industry dynamics.

And this is where human effort remains critical, perhaps in building and maintaining the context layer with help from more AI agents 🙂.

Someone, or something, has to:
- structure the context
- maintain the relationships
- continuously align it with changing systems, policies, architectures, standards, and workflows

Otherwise AI agents keep compensating through larger prompts, repeated retries, excessive retrieval, and increasingly expensive reasoning cycles.

The next phase of enterprise AI

The next phase of enterprise AI will not be defined only by model intelligence. It will be defined by how efficiently enterprises build, maintain, govern, and serve context to those models.
