April 2026. Something strange is quietly unfolding in Silicon Valley.
Harvey, the legal AI company, was valued at $8 billion in December 2025. By March 2026, after its next funding round, it was worth $11 billion. Three months. Three billion dollars.
Anthropic's annualized revenue rocketed from the ten-billion range into the twenty-to-thirty-billion range — in a single quarter. The velocity caught even the most bullish analysts off guard.
And yet — here's the contrast you're meant to notice — according to Morgan Stanley's Q1 CIO survey, U.S. enterprise IT budgets are projected to grow just 3.7% this year.
So where is the money coming from?
If overall enterprise IT budgets are barely growing, then this money has to be taken from somewhere.
Who's being taken from?
Most analysts will answer without hesitation: it's legacy SaaS. It's Salesforce. It's Adobe. It's the old-guard software companies that sell seats by the head.
That answer sounds self-evident. It's practically become the consensus of the 2026 AI investment world. I thought the same thing, at first.
But when I actually pulled the question apart — and worked it from four completely different angles: budget data, valuation evidence, historical patterns, real-time signals — something unexpected emerged.
On the question of "is the application layer getting eaten" — every thread of evidence nods yes.
On the question of "who's eating it" — every thread of evidence points in nearly the opposite direction.
And that disagreement determines what you should be shorting, what you should be buying. It determines whether what you're watching is the twilight of SaaS, or the dawn of an entirely new species.
This episode, I want to walk you into that disagreement. On the surface, it looks like a valuation technicality. But dig deeper and it touches something much larger — the hidden assumption sitting at the bottom of every AI investment framework of the past three years. An assumption almost no one has ever thought to question.
Let's start with what the evidence unanimously agrees on, no matter which direction you approach from. This is the foundation. You have to accept it first, or the later disagreement won't make sense.
First: enterprise AI budgets have actually separated out.
What does that mean? Three years ago, if a CIO wanted to run an AI project, they had to scrape funding out of the "innovation budget," the little discretionary fund the boss kept for experiments, for paying the tuition of learning something new.
Today, according to the latest VC surveys, only a single-digit percentage of AI spending still comes from innovation budgets. The overwhelming majority has been absorbed into IT department or business unit operating budgets.
Behind that number is a single sentence: AI has graduated from "experiment" to "line item."
But here's the problem: CIOs didn't suddenly get richer. Morgan Stanley's Q1 2026 survey shows overall IT budgets growing only 3.7% — while AI/ML as a priority has spiked to 17.7%, far ahead of the next priority, cybersecurity, at 10.7%.
What does that mean?
It means every dollar going into AI has been carved out of something else.
Server budgets cut. Consulting cut. Legacy software licenses cut. That money got freed up and redirected to AI.
Second: the cost of switching models has collapsed.
Three years ago, a company's choice between OpenAI and Anthropic was a strategic decision. Switching models was like swapping engines — engineers had to rewrite every prompt, redo every eval, retrain employee habits.
Today? The latest data from OpenRouter, the multi-model middleware company, shows that in April 2026, the top model by weekly token traffic isn't OpenAI. It isn't Anthropic. It's Xiaomi's MiMo-V2-Pro, out of China, with more than 20% share.
OpenAI plus Anthropic combined have fallen to just over 30%.
The switching latency — which used to mean several seconds of cold-start — has dropped to tens of milliseconds. Humans can't even perceive it anymore. Switching a model today is like switching a SIM card.
Third: the infrastructure layer is consolidating at terrifying speed.
Anthropic's annualized revenue went through an explosive climb in Q1 2026 — from the ten-billion range into the twenty-to-thirty-billion range. Behind that number is enterprise customer count doubling in weeks.
Amazon's management emphasized in its latest earnings that AI has become a core pillar of AWS revenue. Industry estimates put the annualized run rate somewhere around $15 billion.
Meta signed a single compute deal with CoreWeave worth $21 billion.
This is real money. The consolidation of the infrastructure layer is not up for debate.
Fourth — and this is the most unsettling one: the vital signs of traditional SaaS are structurally deteriorating.
There's a core metric in this industry called Net Revenue Retention — NRR. In plain terms, it measures how much more an existing customer automatically spends each year. In the SaaS 1.0 era, this metric was the foundation that made the whole business model work.
Here are the numbers:
The median NRR for publicly traded SaaS companies has fallen from 117% in early 2021 to around 106% today. And the latest statistics for private SaaS companies show a median net revenue retention of only about 101%.
What does that mean? 101% means existing customers are essentially not increasing their spending at all. The old SaaS playbook of "sell once and collect for five years while sitting still" — that era is ending.
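Because NRR compounds year over year, the gap between 117% and 101% is far larger than it looks. Here is a quick sketch; the NRR figures are the survey medians quoted above, while the $1M starting cohort is an invented round number:

```python
# Illustrative only: how NRR compounds over a five-year contract horizon.
# The 117% / 106% / 101% figures are the survey medians quoted in the text;
# the $1M starting cohort is a made-up round number.

def cohort_revenue(start: float, nrr: float, years: int) -> float:
    """Revenue from one customer cohort after `years` of compounding at `nrr`."""
    return start * nrr ** years

base = 1_000_000  # hypothetical cohort: $1M of ARR at signing
for label, nrr in [("2021 public median (117%)", 1.17),
                   ("2026 public median (106%)", 1.06),
                   ("2026 private median (101%)", 1.01)]:
    print(f"{label}: ${cohort_revenue(base, nrr, 5):,.0f} after 5 years")
```

At 117%, that $1M cohort grows to roughly $2.19M in five years with no new sales effort; at 106% it reaches about $1.34M; at 101% it barely moves, at about $1.05M. That compounding gap is the "sell once and collect for five years" model eroding in real time.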
These four observations are the floor. No matter which thread of evidence you pull on, the conclusion is the same. This is the foundation.
The disagreement isn't in the foundation. The disagreement is in the interpretation. Legacy SaaS is dying. But who is it dying at the hands of?
Let me show you two completely different stories.
Story One: the Infrastructure-Layer-Draining thesis.
This story says: what's eating SaaS is the infrastructure layer.
The logic runs like this: every AI call costs an inference fee — paid to OpenAI, paid to Anthropic, paid to AWS. These are hard costs. AI-native application companies have to swallow them.
Multiple industry surveys of AI-native gross margins show many of these companies running at 30-50%, versus 75-85% for traditional SaaS. Their margins are essentially cut in half.
This is the famous "inference tax."
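Here is a back-of-the-envelope version of that margin math. The 75-85% and 30-50% gross-margin ranges come from the surveys cited above; the specific dollar amounts below are invented purely for illustration:

```python
# Illustrative unit economics: why per-call inference fees compress gross margin.
# All dollar figures are hypothetical; only the resulting margins mirror the
# survey ranges quoted in the text.

def gross_margin(revenue: float, hosting: float, inference: float = 0.0) -> float:
    """Gross margin = (revenue - cost of revenue) / revenue."""
    return (revenue - hosting - inference) / revenue

revenue = 100.0       # $100 of subscription revenue
saas_hosting = 20.0   # classic SaaS: hosting/support only
ai_hosting = 15.0     # AI-native: hosting/support...
ai_inference = 45.0   # ...plus the "inference tax" paid per call

print(f"SaaS 1.0 gross margin:  {gross_margin(revenue, saas_hosting):.0%}")
print(f"AI-native gross margin: {gross_margin(revenue, ai_hosting, ai_inference):.0%}")
```

In this toy example the traditional vendor keeps 80 cents of every revenue dollar, while the AI-native vendor keeps 40, because nearly half of each dollar flows straight through to the model and cloud providers.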
By the logic of this story, we are witnessing the death of SaaS. The industry has even given it a biblical name: SaaSpocalypse — the Judgment Day of SaaS.
Story Two: the alternative reading.
But look a little longer and something snags. This explanation sounds self-evident, yet when you study the evidence carefully, it has a fatal hole.
If it's really the infrastructure layer that's stealing the money, then why are vertical AI applications not just surviving, but thriving beyond belief?
Let me show you three companies.
Harvey, legal AI. Clients are top-tier law firms. Investors openly describe its client budgets as "growing the more they use it" — retention and expansion are both off the charts.
Cursor, coding AI. Multiple investors publicly compare its growth curve to that of early GitHub Copilot — one of the fastest growth arcs of the AI era.
Abridge, medical transcription AI. It has already entered the workflows of multiple large hospital systems through Epic's marketplace. For physician rounds documentation, it's basically become default-on.
These three companies, according to the infrastructure-draining logic, should have been crushed by the inference tax long ago. Yet not only are they alive — they're among the most profitable species in this entire wave of AI.
So who's actually dying?
What's dying is Salesforce adding an AI module to its existing Sales Cloud and charging an extra $30 per seat — that kind of "bolt-on monetization."
What's dying is Microsoft 365 Copilot: over a year after launch, multiple sell-side estimates put its real paid penetration rate still stuck in the single digits.
What's dying are the companies selling general-purpose AI tools for under $50 a month. Industry statistics show this category retains barely 20-something percent of its revenue. Which is to say: most customers, within a year, either downgrade or stop using it entirely.
So the problem isn't "the application layer is being wiped out wholesale." The problem is that the application layer is fracturing internally — and what's actually eating legacy SaaS's lunch isn't the infrastructure layer at all. It's another species, one we haven't even named yet.
What is that new species?
Let's run a thought experiment.
Imagine you're a Sales Director at a mid-sized company. Three years ago, your workflow looked like this: Salesforce for customer management, Outlook for email, Slack for communication, Tableau for data.
Every one of those tools charges per seat. This is the world of SaaS 1.0: pricing by the head. Every SaaS company lives inside the same accounting logic.
Fast-forward to 2026. Your workflow might look like this:
You open an Agent and tell it: "Follow up with everyone who attended last week's demo."
The Agent goes into Salesforce on its own, pulls the data, drafts the emails, books the calendars, generates the reports. You just review the work after it's done.
See the problem?
In this new workflow, between you and Salesforce, there's a new layer. Salesforce is no longer the product you directly use — it has become a tool that the Agent calls.
This "middle layer" is what nearly every analyst has missed — the Agent Orchestration Layer.
And who occupies this layer?
Salesforce's own Agentforce. Microsoft's Copilot Studio. ServiceNow's Now Assist. Google's Workspace AI.
At this moment, value is quietly changing hands.
Salesforce has shifted Agentforce's pricing from per-seat to per-conversation — charged by the interaction. This is a revolution in accounting terms. It sidesteps every comparability benchmark of traditional SaaS and, in an instant, breaks every legacy valuation model the analysts use.
Because Agents aren't priced by "how many employees are using this" — they're priced by "how much work got done."
If this trend holds — if in the next 18 to 36 months, the Agent Orchestration Layer actually crystallizes — then the "infrastructure vs. application" binary we're debating today is wrong.
Value will converge to an entirely new tier — one we are only just beginning to name.
Those who bet on "short SaaS, long compute" may find their long side is right (compute will keep going up), but their short side is aimed at the wrong target. You shouldn't be shorting all of SaaS. You should be shorting the horizontal, bolt-on kind, and going long the companies that actually have Agent orchestration capability.
Let me step back and tell you something that shook me when I first saw it.
There's an argument buried in the research that I keep coming back to:
Across the past three IT cycles, infrastructure-layer consolidation has never led to application-layer fragmentation.
Let's walk through the history quickly.
1960s to 1970s, the mainframe era. Infrastructure was consolidated in IBM's hands; IBM alone controlled nearly 70% of the market. What about applications? Also concentrated in IBM and the "Seven Dwarfs," the seven smaller mainframe makers that trailed it. Simultaneous consolidation.
1990s, the client-server era. Infrastructure consolidated under the Wintel alliance and Oracle. Applications? SAP in ERP. Siebel in CRM. PeopleSoft in HR. Each monopolizing a vertical. Simultaneous consolidation.
2010s, cloud computing and SaaS 1.0. Infrastructure consolidated into AWS, Azure, GCP. Applications? Salesforce in sales, Workday in HR, ServiceNow in IT service management. Still simultaneous consolidation — just lagged by 5 to 7 years.
Three cycles. One unified pattern: after infrastructure layer consolidation, the application layer consolidates too. Just with different lags.
If AI in 2026 really is breaking this rule — infrastructure consolidated but applications dispersed — then it would be the first exception in the history of IT.
Exceptions are possible. But the burden of proof should lie with the party claiming the exception.
This is why we should be highly skeptical of the popular assumption that "the application layer will stay permanently fragmented."
This isn't to say the assumption is wrong. It's saying: if you want to claim something that contradicts a pattern that has held across three historical cycles, the evidence you bring to the table has to be a lot stronger than "CIO budgets are tight" and "model switching costs have dropped."
So far, no one has put that evidence on the table.
Put all the threads together, and the real picture looks like this:
The AI value chain isn't fracturing into two layers. It's fracturing into three.
Layer one: infrastructure. Continuing to consolidate toward the top. NVIDIA, CoreWeave, hyperscale cloud, sovereign AI capacity. This story is still ongoing. No reversal.
Layer two: applications — fracturing internally, not wiped out wholesale.
Who are the winners? Those with data flywheels, workflow lock-in, and vertical depth. AI-native companies. Harvey, Cursor, Abridge — they're not just surviving, they're the fastest-growing species in this entire wave. Primary markets are willing to value them at tens of times ARR.
Who are the losers? The horizontal, generic, bolt-on-monetization plays. Microsoft 365 Copilot can't get penetration off the ground. Salesforce Einstein's $30-per-seat upcharge keeps getting rejected by customers. These aren't cases of "infrastructure stealing the budget"; these are cases of added value that customers simply don't find compelling enough to pay for.
Layer three: a brand-new Agent Orchestration Layer is rising.
This is a species that only really emerged over the past six months, occupied so far mostly by the big incumbent platforms: Salesforce Agentforce, Microsoft Copilot Studio, ServiceNow Now Assist. They are seizing the value-capture point of the traditional application layer.
This layer may well be the strongest link in the entire AI value chain over the next 18 to 36 months.
And it is almost entirely absent from the thesis statements of every mainstream analyst.
At this point, I want to leave a few questions with you.
Question one: is the data flywheel actually a real moat?
Harvey wins because it has accumulated a massive legal corpus and built unique fine-tuning data. Cursor wins because it knows every programmer's code-completion preferences.
But can this kind of moat be replicated laterally? Or is it only a handful of particularly lucky verticals that will ever form one?
If, in the next 12 months, three or more vertical AI companies can demonstrate an equivalent moat, then the "vertical AI concentration" thesis gets locked in. If not — then today's stars might just be cyclical phenomena.
Question two: how will Agent Orchestration Layer monetization actually converge?
Salesforce currently has three pricing models for Agentforce:
- Per conversation.
- Per outcome.
- Per credit.
Which one becomes the industry standard?
This looks like a technical detail. But it directly determines what the valuation multiple for the Agent Orchestration Layer ultimately lands at. Per-seat legacy SaaS trades at 8 to 12 times ARR. Per-conversation? No one knows.
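To see why the accounting change breaks comparability, consider a toy model. Every specific number here (seat price, conversation price, volumes) is invented; only the per-seat versus per-conversation structure comes from the discussion above:

```python
# Toy model: the same customer under per-seat vs per-conversation pricing.
# Per-seat revenue scales with headcount; per-conversation revenue scales
# with work performed, so automation can grow revenue even as seats shrink.

def per_seat_revenue(seats: int, price_per_seat: float) -> float:
    return seats * price_per_seat

def per_conversation_revenue(conversations: int, price_per_conv: float) -> float:
    return conversations * price_per_conv

# Hypothetical mid-sized sales team (all numbers invented)
monthly = {
    "per-seat, before agents": per_seat_revenue(100, 150.0),  # 100 seats @ $150
    "per-seat, after agents": per_seat_revenue(80, 150.0),    # headcount shrinks
    "per-conversation": per_conversation_revenue(50_000, 0.50),  # 50k interactions @ $0.50
}
for model, revenue in monthly.items():
    print(f"{model}: ${revenue:,.0f}/month")
```

The two revenue lines respond to opposite forces: per-seat revenue falls as agents shrink headcount, while per-conversation revenue rises with automation. An analyst applying a per-seat ARR multiple to a per-conversation business is measuring the wrong thing.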
Question three — and this is the one I'm personally most interested in — is China the leading indicator for the U.S. market?
China's AI application layer has, from day one, been a landscape of multi-model concurrency, price wars, and application-layer fragmentation. Especially after DeepSeek.
If the U.S. market is evolving in that direction, then what China looks like today may well be what the U.S. looks like 12 months from now.
And coverage of the Chinese market is precisely the area where Western analysts are seriously absent. This may be the most undervalued information source in the entire current wave of AI investing.
Back to the question we started with.
Where is the money coming from? The lunch that's being eaten — who is eating it?
Different analytical paths led to different answers. And the one that was the least mainstream, the most contrarian to market consensus — the rise of the Agent Orchestration Layer — may be the one closest to the truth.
But more important than the answer itself is what this whole exercise reminds us of:
When something becomes consensus, it usually stops being alpha.
The 2026 AI investment consensus is the "infrastructure vs. application" binary. This consensus is correct in many ways. But its resolution may already be far too coarse.
The real opportunity is in the seams of the consensus. In the hidden assumptions that everyone takes for granted but no one has ever questioned.
And to find those seams, what you need isn't more data. What you need is — the willingness to ask the question everyone else thinks there's no point in asking anymore.
That's all for this episode.
Next episode, I want to talk about China's AI application layer specifically. That may well be the most undervalued piece of the puzzle in this entire wave.
See you next time.
All data cited in this episode is accurate as of April 14, 2026.