The Token Bill Is Coming: Industry Insiders Struggle to Manage Artificial Intelligence’s Costs

Across the industry, companies are beginning to question the price of artificial intelligence. Uber blew through its entire 2026 AI coding budget by April. Microsoft revoked the Claude Code licenses it had granted its developers months after activating them. A Priceline employee told TechCrunch that a routine contract renewal came back 4-5 times more expensive.

Although per-token prices have fallen, the push for broader AI adoption and increasingly autonomous agents has driven token consumption ever higher. Companies that were sold on all-you-can-eat subscriptions in early 2025 are now trying to figure out where their money is going, pull back their spending, and determine whether they can salvage some ROI from the wreckage of their budgets.

Meanwhile, a market is forming to meet them there. Startups, established vendors and a new standards organization are fighting to give companies the tools and language to track what they spend.

“Six months ago, I would have a conversation with a customer and it would be all about, ‘What can it do? Is it good enough?’” OpenAI chief operating officer Alexander Embiricos told TechCrunch at an event in New York this week. “Our conversations are never about that now. Now the conversations are about, ‘Hey, we’re spending so much. What visibility do you have? What control do you have? What fine-grained controls do you have? What’s the effectiveness of your models?’”

It’s against this backdrop that the Linux Foundation this week unveiled plans for the Tokenomics Foundation, a new standards body that aims to bring to AI tokens the same cost discipline that FinOps brought to cloud spending.

“In April and May, I started hearing from companies, ‘Oh my gosh, we’re over 3x our token budget for 2026 and it’s only April,’” JR Storment, executive director of the FinOps Foundation, a project within the Linux Foundation, told TechCrunch. “We started hearing existential crises and the whole conversation shifted from tokenmaxxing and ‘go fast’ to ‘we need guardrails, how can we control this?’”

The outcry across the tech world followed fervent demands from CEOs pushing their teams to use the best models and move fast, cost be damned. New models released in November, such as Anthropic’s Claude Opus 4.5, OpenAI’s GPT-5.1, and Google’s Gemini 3 Pro, brought major improvements to agentic tools, which multiplied consumption. One company reportedly found itself with a $500 million Claude bill after forgetting to set usage limits for employees.

“It’s like the crack cocaine epidemic,” said Chris Reed, senior director of IT finance at Priceline, noting that the company had begun setting limits on certain groups. “They let you try it to get you hooked on it, and now you’re beholden to it.”

Vitaly Gordon, CEO of engineering operations platform Faros AI, said he recently spoke with a CTO who told him, “One of my engineers spent $40,000 on tokens last month, and I really don’t know if I should stop him or go and tell everyone to be like him.”

A March report by Faros covering 20,000 developers found that output increased, but so did bugs and rework. Jellyfish, an engineering management platform, similarly found that engineers who used the most tokens were about twice as productive as those who used less AI, but spent 10 times as many tokens to get there.

Nicholas Arcolano, head of research at Jellyfish, told TechCrunch via email that AI spending is soaring largely due to agents, with spending per developer growing roughly 18.6x in nine months. Taken together, these statistics make the productivity case murkier than spending suggests.

“Whether extreme spending pays off comes down to the ultimate business value of the code shipped (i.e., revenue), which most companies still can’t measure,” said Arcolano.

At least part of this measurement issue is the sheer scale at which AI is being used today.

“Tracking cloud costs is a data problem of hundreds of millions of rows per month,” said Storment. “Tracking token costs is a trillion-row-a-month data problem. You can’t just stick it into any spreadsheet or even a basic tool. You have to fundamentally rethink your tools, your specifications, and your accounting systems to do that.”

At Priceline, Reed is already seeing discrepancies. He noted issues between a vendor’s reported usage and Priceline’s internal data.

“I started my career in telecom spend management and I see all the same parallels, from telecom to cloud to artificial intelligence,” he said. “Every time you introduce something new, it’s ripe for billing errors and opportunities to audit and optimize.”

A market is starting to form around this problem. There are pure-play companies like Pay-i that track, measure and optimize the cost and return on GenAI investments. Payment, meanwhile, allows developers to track costs, measure usage, and charge users based on actual value rather than subscription fees.

Then there are companies like Jellyfish, Waydev, and Faros AI, all of which provide AI agent tracking to prove the ROI of developer tools. Storment says most of the FinOps Foundation’s 180 vendors lean toward this space.

Companies with existing distribution are also adding new features to take advantage of this new market. Ramp recently moved into AI spend management; Datadog and New Relic have rolled out services such as cloud cost management, token-level observability, and GPU monitoring. At the FinOps X conference next week, AWS is expected to introduce new financial management features geared toward enterprise AI spending.

Tiffany Luck, a partner at NEA, believes efficiency and observability will likely be added at the “harness or application layer.” She pointed to Factory, a startup that builds AI agents for businesses, which this week launched a router model that automatically selects the right model for each job.

Gordon expects frontier labs and other model providers to adopt OpenRouter-style optimization to route queries to the cheapest capable models, a trend already seen in Claude’s enterprise accounts.

“The financial report of how much you spend on Anthropic, even if you call the Opus model, some of the spend is going to be on Sonnet or Haiku because they’re smart enough to do that,” Gordon said. “I think that’s going to become more and more of an issue.”

But all of these tools are created without a common language or common definitions of how much a token costs, what it produces, and how to compare spending across vendors. That’s where the Tokenomics Foundation hopes to prove useful.

The Foundation is building a canonical definition and framework for “tokenomics”; open standards, specifications and metrics for AI token usage and billing; and new metrics for AI economics, such as cost per intelligence or tokens per watt. It also plans to establish metrics for plant efficiency and consumption efficiency. The team is planning an official launch in July and is set to announce more members at the FinOps X conference next week.

“Token economics is fundamentally more abstract and opaque than anything we’ve done at this scale before,” said Nishant Gupta, head of availability at Salesforce. “It requires a different operational muscle than the industry built for the cloud.”

That said, Goldman Sachs projects that global token usage will multiply 24x by 2030. Companies already over budget need solutions now, and the foundation’s first deliverable is still months away.

“We may have created a steam engine, but we still haven’t figured out the assembly line,” Gordon said.

According to Arcolano, the smart move is broad, modest adoption.

“The best return on investment comes from moving the broad middle from low to moderate usage, without pushing heavy users higher,” he said.

Russell Brandom and Tim Fernholz contributed to this report.

