The Token Economy: When Intelligence Becomes a Line Item
The unit of work in software was the instruction for sixty years. Now it's the token. That changes who gets hired and who gets fired.
For sixty years, the fundamental unit of work in software was the instruction. Deterministic. Predictable. A human writes code, a machine executes it, and the value is denominated in how cleverly the human can sequence those instructions. The developer's entire job was translation: turn business logic into machine logic, one function at a time, one Jira ticket at a time.
That era is ending. Not gradually, not with a polite sunset period. It's ending the way these things always end — with most people still optimising for the old model while a minority quietly builds the new one.
The unit of work is now the token. And a token is not an instruction. It's a unit of purchased intelligence.
The distinction matters more than it appears on the surface. In the instruction paradigm, you tell the machine what to do step by step. In the token paradigm, you describe what you want, feed it context, and purchase enough intelligence to get a result. The machine figures out the workflow steps on its own. The human's job shifts from writing logic to specifying outcomes and managing the intelligence budget that produces those outcomes.
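To make the contrast concrete, here's a minimal sketch in Python, assuming the Anthropic SDK and an illustrative model name; swap in whichever provider you actually use. The first function encodes the workflow by hand; the second specifies the outcome, supplies context, and pays for the result in tokens.

```python
import anthropic  # pip install anthropic; assumes ANTHROPIC_API_KEY is set

client = anthropic.Anthropic()

# Instruction paradigm: the human encodes every step of the workflow.
def categorise_expense(description: str, amount: float) -> str:
    if "flight" in description.lower() or "hotel" in description.lower():
        return "travel"
    if amount < 50:
        return "incidentals"
    return "needs_review"

# Token paradigm: the human specifies the outcome and supplies context;
# the purchased intelligence works out the steps.
def categorise_expense_llm(description: str, amount: float, policy_doc: str) -> str:
    response = client.messages.create(
        model="claude-sonnet-4-20250514",  # illustrative model name; use whatever you run
        max_tokens=20,
        messages=[{
            "role": "user",
            "content": (
                "Classify this expense into exactly one category from the policy. "
                "Return only the category name.\n\n"
                f"Policy:\n{policy_doc}\n\n"
                f"Expense: {description}, ${amount:.2f}"
            ),
        }],
    )
    return response.content[0].text
```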
This is not a tools upgrade. This is not “better autocomplete” or “AI-assisted development.” It is a categorical change in what computing is.
The numbers make the shift concrete. Andreessen Horowitz's latest enterprise AI survey found that average enterprise LLM spend has risen from roughly $4.5 million to $7 million over two years, with projections pushing past $11 million in 2026. The share of organisations planning to spend over $100,000 monthly on AI has more than doubled. Innovation budgets — the “let's experiment” line item — have collapsed from 25% of LLM spending to just 7%. The language has shifted from exploration to infrastructure.
When the average organisation moves AI from the “discretionary” column to the “essential infrastructure” column, you are watching a paradigm crystallise in real time.
Per-token inference costs have been falling at rates that make Moore's Law look gentle. Somewhere between 10x and 200x per year, depending on the benchmark and how generously you measure. GPT-4 cost around $30 per million input tokens at launch in early 2023. Today, equivalent performance costs pennies. Claude Sonnet runs at $3 per million input tokens. Fast forward a year and the current frontier models will be priced in cents.
But here's the wrinkle most people miss: when a resource gets cheaper, you don't use less of it. You use an enormous amount more.
This is Jevons' paradox — the well-documented observation that efficiency gains in resource use lead to increased total consumption, not decreased. Steam engines got more efficient; coal consumption exploded. Cloud computing got cheaper; AWS bills went up. Satya Nadella invoked the paradox by name in early 2025 after the DeepSeek moment, saying that “as AI gets more efficient and accessible, we will see its use skyrocket.” He was defending Microsoft's infrastructure spending, but he could have been speaking for every hyperscaler pouring capital into AI compute.
The average organisation now spends $85,000 a month on AI, up 36% year-over-year according to Deloitte. And the share planning to push past six figures monthly has doubled. The token economy is not coming. It arrived while most executives were still debating whether to form an AI committee.
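The arithmetic behind Jevons is worth a glance. These numbers are invented for illustration, not pulled from any vendor's bill, but the shape is the one the surveys keep finding: price down an order of magnitude, usage up more, spend up anyway.

```python
# Illustrative Jevons arithmetic: prices fall, usage rises faster, spend grows.
# All figures are made up for the example, not taken from any provider.
old_price_per_m_tokens = 20.00   # $ per million tokens
new_price_per_m_tokens = 2.00    # 10x cheaper

old_monthly_tokens_m = 500       # million tokens/month at the old price
new_monthly_tokens_m = 25_000    # 50x more usage once automation is cheap enough

old_spend = old_price_per_m_tokens * old_monthly_tokens_m   # $10,000
new_spend = new_price_per_m_tokens * new_monthly_tokens_m   # $50,000

print(f"price fell {old_price_per_m_tokens / new_price_per_m_tokens:.0f}x, "
      f"spend rose {new_spend / old_spend:.0f}x")
```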
If you want to understand why token management is now a core business competency, look at what happens when you get it wrong.
Cursor — the AI coding editor that hit billion-dollar revenue at startling speed — found itself in a structural trap. It sends essentially all of its revenue to Anthropic in API costs. When Anthropic introduced priority service tiers and repriced caching, Cursor's AWS costs more than doubled from $6.2 million to $12.6 million in a single month, as reported by Ed Zitron from leaked billing data. The company had to gut its unlimited $20/month plan and introduce a $200/month tier. Users revolted. The subreddit became a complaint forum.
In the same reporting, Zitron disclosed that Anthropic itself had spent $2.66 billion on AWS through September 2025 — against an estimated $2.55 billion in cumulative revenue. More than 100% of topline went to compute before even accounting for Google Cloud spend. Perplexity, separately, was burning 164% of its entire 2024 revenue across AWS, Anthropic, and OpenAI combined.
These are not companies being reckless with money. They are companies operating in a fundamentally different paradigm where intelligence is a purchasable input with its own price curve and consumption curve. The ones that master token economics will compound. The ones that don't are one supplier pricing change away from crisis. There is a reason Cursor's response included building their own model — they needed to escape the dependency.
In the old paradigm, the scarce resource was developer time. You hired engineers, gave them tools, and the constraint on output was how many hours of skilled labour you could deploy. The management challenge was headcount planning, recruiting, retention — all the machinery of human capital management.
In the token paradigm, the bottleneck has moved. Raw intelligence is abundant and getting cheaper by the quarter. What's scarce is the ability to convert tokens into usable economic value. Knowing how to aim intelligence. How to structure context. How to route tasks to the right model at the right cost. How to build agent loops that sustain quality over time. How to measure whether the intelligence you're purchasing is actually producing the outcomes you need.
This creates an entirely new organisational capability. Call it token management, intelligence operations, context engineering — the name doesn't matter yet. What matters is that it's a real skill, it's measurable, and the organisations that build it are pulling away from everyone else.
The enterprises that have figured this out are building internal platforms that route work to the right model at the right price point. Haiku for the cheap stuff, Opus for the hard stuff, Sonnet for the middle. They negotiate custom API agreements with hyperscalers, commit to consumption floors in exchange for dedicated capacity and volume pricing. They are treating token spend not as a cost to minimise but as a lever to maximise return on intelligence.
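A stripped-down sketch of what such a routing layer looks like, with illustrative prices and a deliberately naive difficulty heuristic; real platforms route on evals and historical success rates, not hand-written rules.

```python
# Cost-aware model routing: send each task to the cheapest model likely to handle it.
# Tier names and prices are illustrative assumptions, not current list prices.
from dataclasses import dataclass

@dataclass
class ModelTier:
    name: str
    input_price_per_m: float   # $ per million input tokens
    output_price_per_m: float  # $ per million output tokens

TIERS = {
    "cheap": ModelTier("haiku-class", 0.80, 4.00),
    "mid":   ModelTier("sonnet-class", 3.00, 15.00),
    "hard":  ModelTier("opus-class", 15.00, 75.00),
}

def route(task: dict) -> ModelTier:
    """Pick a tier from crude difficulty signals; a real router would use evals."""
    if task.get("needs_deep_reasoning") or task.get("failed_on_cheaper_model"):
        return TIERS["hard"]
    if task.get("touches_many_files") or task.get("ambiguous_spec"):
        return TIERS["mid"]
    return TIERS["cheap"]

def estimate_cost(tier: ModelTier, input_tokens: int, output_tokens: int) -> float:
    return (input_tokens * tier.input_price_per_m
            + output_tokens * tier.output_price_per_m) / 1_000_000

tier = route({"ambiguous_spec": True})
print(tier.name, f"${estimate_cost(tier, 40_000, 2_000):.4f}")
```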
This is where the conversation gets personal for anyone who writes software for a living, and where most of the current discourse is getting it catastrophically wrong.
The standard narrative has been binary: either AI replaces the developer or it doesn't. That framing is useless. What's actually happening is that the developer role is differentiating rapidly into at least three distinct tracks, each with different skill requirements, different compensation dynamics, and very different futures.
The first track is orchestration. This developer doesn't write code. They specify outcomes and manage the intelligence that produces those outcomes. Core skills: system design, specification writing, quality evaluation, and token economics. They think in terms of agent architectures, context windows, eval frameworks, cost per outcome. They're effectively factory managers overseeing intelligence production lines.
Their value scales with the volume of intelligence they can direct. Compensation will increasingly correlate with token budgets rather than lines of code. This track favours people who are brilliant at decomposing problems, who can write precise specifications, and who can evaluate output quality ruthlessly.
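The metric that defines this track is cost per accepted outcome, not cost per token. A minimal sketch, where the accept() check is a stand-in for whatever quality bar the team actually trusts (test suites, rubric grading, human review):

```python
# Cost per accepted outcome: the orchestrator's core metric.
# `results` would come from agent runs; `accept` is whatever eval you trust.
from typing import Callable

def cost_per_accepted_outcome(results: list[dict], accept: Callable[[str], bool]) -> float:
    """results: [{"output": ..., "token_cost_usd": float}, ...]"""
    total_spend = sum(r["token_cost_usd"] for r in results)
    accepted = sum(1 for r in results if accept(r["output"]))
    if accepted == 0:
        return float("inf")  # all spend, no value: the number you never want to see
    return total_spend / accepted

runs = [
    {"output": "patch A", "token_cost_usd": 0.42},
    {"output": "patch B", "token_cost_usd": 1.10},
    {"output": "patch C", "token_cost_usd": 0.31},
]
print(cost_per_accepted_outcome(runs, accept=lambda out: out != "patch B"))
```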
The second track is systems building. Someone has to build the infrastructure the orchestrators use: the agent frameworks, evaluation pipelines, context management systems, and routing layers that send the right task to the right model at the right cost. This is deep technical work, closer to traditional systems engineering than application development, but with an entirely new stack.
These developers need to understand model behaviour at a mechanical level. How context windows affect output quality. How different architectures handle different task types. How to build reliable systems on top of probabilistic components. This track is smaller in volume, more specialised, but the compensation ceiling is very high because the impact is company-wide.
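One small example of the mechanical territory this track owns: packing retrieved context into a fixed token budget. The four-characters-per-token estimate and the greedy selection are simplifying assumptions; production systems use the model's own tokeniser and smarter ranking.

```python
# Greedy context packing under a token budget.
# Rough heuristic: ~4 characters per token. Real systems use the model's tokeniser.

def estimate_tokens(text: str) -> int:
    return max(1, len(text) // 4)

def pack_context(chunks: list[tuple[float, str]], budget_tokens: int) -> str:
    """chunks: (relevance_score, text) pairs from retrieval. Keep the most relevant
    pieces that fit, because stuffing the window indiscriminately degrades output."""
    packed, used = [], 0
    for score, text in sorted(chunks, key=lambda c: c[0], reverse=True):
        cost = estimate_tokens(text)
        if used + cost > budget_tokens:
            continue
        packed.append(text)
        used += cost
    return "\n\n".join(packed)

context = pack_context(
    [(0.9, "Relevant design doc section..."), (0.2, "Tangential changelog entry...")],
    budget_tokens=10,
)
print(context)
```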
The third track is domain translation, the one almost nobody is talking about, and it may be the largest of the three. These are developers, or increasingly non-developers, who combine enough technical fluency to work with AI systems and enough deep domain expertise to know which problems are worth solving in a specific market.
The dental practice management specialist is now a developer. The construction scheduling expert is now a developer. The insurance compliance analyst can now build tools instead of just using them. Their value isn't in token management or infrastructure. It's in their ability to point intelligence at the right problem, in the right market, with the right context. And that value goes up as intelligence gets cheaper, because cheaper intelligence makes more niche problems economically viable to solve.
Then there's everyone in between. The developer who writes competent application code but doesn't have deep systems expertise or deep domain knowledge is the most exposed. Not because AI will immediately replace them (that narrative is too simple) but because the value of generic code production is approaching zero at the same rate as the cost of tokens.
“I'm using AI to code” is not a career strategy. It's a treadmill. The three tracks above are career strategies.
Most engineering organisations are still structured around headcount — the bottleneck of the last sixty years. Full-time equivalents. Productivity measured (badly) in output per engineer. Hiring plans built around projected workload.
In a token-based paradigm, output is limited not by headcount but by the ability to convert intelligence spend into business value. An organisation with 50 engineers managing agents can outproduce one with 500 engineers writing code by hand — if the 50-person org has better specifications, better evaluation frameworks, better context engineering, and a higher token budget per engineer.
That's not exaggeration. It's already happening. OpenAI is now offering enterprise agent management, and rumoured pricing tiers run from $2,000/month for knowledge worker agents up to $20,000/month for AI researchers. Enterprise buyers are reportedly calling it cheap — because even at $20,000 a month, it's a fraction of the fully-loaded cost of the PhD researchers they'd otherwise hire.
Klarna provides the cautionary tale. They had a disastrous initial AI rollout — fired customer service staff, had to rehire them. But because they built AI tooling along that rocky journey, they're now seeing revenue per employee scale into seven figures, far above the SaaS average. When their CEO says “the world isn't ready for the impact AI will have on knowledge work,” he's not theorising. He's describing what happened to his own company.
A16Z's data shows this pattern spreading. AI-native companies run at 3-5x revenue per employee versus traditional SaaS. A $10 million ARR AI startup might operate with 15 people where a traditional SaaS company would need 55 to 70. That ratio will widen as tooling matures, eventually forcing larger organisations to restructure or accept a permanent productivity disadvantage.
There's a second-order effect here that matters more than the headcount story: what gets built changes when the cost of building falls.
Every enterprise has a backlog of projects that were never economically viable. The internal tool that would save 200 hours a year but cost 2,000 hours to build. The integration that would unlock a new revenue stream but require a team of four for six months. The niche market vertical that could work but couldn't justify the engineering allocation.
When tokens make building 10x cheaper, those backlogs become gold mines. The enterprises that recognise this will dramatically expand the scope of what they build, not just the speed. If you're only optimising for headcount reduction, you're playing the wrong game. Your competitors optimising for output expansion will eat you.
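The backlog arithmetic is simple enough to write down. Using the illustrative numbers from above: a 2,000-hour build that saves 200 hours a year pays back in a decade and never gets funded; make building 10x cheaper and it pays back within a year.

```python
# Build-vs-benefit arithmetic for the dormant backlog. Numbers are illustrative.
def payback_years(build_hours: float, hours_saved_per_year: float) -> float:
    return build_hours / hours_saved_per_year

print(payback_years(2_000, 200))        # 10.0 years: never gets approved
print(payback_years(2_000 / 10, 200))   # 1.0 year: suddenly an obvious yes
```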
The market is splitting, but not along the axis most people assume. It's not big companies versus small companies, or incumbents versus startups, or “who can afford the most tokens.”
It's splitting along generalised scale versus specialised precision.
At the top: enterprises and well-funded AI-native companies competing on token volume, building horizontal platforms, running agents on broad workflows that every large organisation shares. Their advantage compounds with every model upgrade. Their moat is capital and infrastructure.
Across the enormous surface area of the rest of the market — which is expanding rapidly — builders win on specificity. The sharp angle. The niche market. The customer relationship that no amount of token spend can replicate. Their advantage compounds with domain knowledge. Their moat is distribution and trust.
Both sides benefit from the same trend: intelligence getting cheaper makes more things possible. Enterprises use the token paradigm to scale horizontally. Specialists use it to go deeper vertically. The new reality doesn't pick a winner between the two. It makes both strategies more potent and widens the gap between either strategy and the old model of hand-written code denominated in developer hours.
Goldman Sachs will spend more on inference than any startup. But Goldman can't sell AI-powered inventory management to a 50-location restaurant chain, because intelligence is purchasable but distribution is not. The startup playbook in the token economy isn't “raise more money, buy more tokens.” It's know a market so well that a $200/month subscription aimed precisely creates more downstream value than a $20,000/month agent budget pointed at the wrong problem.
There's been breathless speculation about when the first one-person billion-dollar company will emerge. Silicon Valley apparently has side bets running on whether it happens this year.
The speculation misses the point. The interesting insight isn't that one exceptional founder might hit a milestone. It's that when intelligence is purchasable by the token and the cost of building keeps falling, the minimum viable team for software is converging on one. Going independent is no longer a lifestyle trade-off. It's increasingly a rational economic choice for anyone with deep domain knowledge and sufficient AI fluency.
This also implies massive downward pressure on team sizes at large software companies. Amazon talked about two-pizza teams for years. We're heading toward half-pizza teams. The same dynamic that enables solo founders compresses team sizes everywhere.
When the fundamental unit of computing changes, everything downstream changes with it. Careers. Org charts. Compensation models. What's worth building. Who builds it. How it's funded.
The developers who thrive will be the ones who move decisively toward one of the three tracks: orchestration, systems building, or domain translation. The organisations that thrive will be the ones that treat token spend as a strategic lever rather than a cost line to minimise. The founders who thrive will be the ones who understand that distribution and domain expertise beat raw compute budget.
The ones who don't adapt? They'll spend the next five years optimising for a paradigm that no longer exists, wondering why the ground keeps shifting under their feet.
It's already shifting. The token economy doesn't care whether you're ready.
Sources: Andreessen Horowitz Enterprise AI Survey 2026 | Ed Zitron — Anthropic and Cursor AWS Costs | Deloitte — AI Token Spend Dynamics | TechCrunch — OpenAI Enterprise Agents | Business Insider — AI Demand Surge 2026