Part 3: Moats
Part 2 made the distinction between utility and capture. This chapter applies that distinction to the strongest current ownership claim in AI: that a few centralized model providers will remain the permanent home of intelligence. I think that story is much weaker than it looks, and it sets up the SaaS consequences explored in Part 4.
Intelligence alone is not the moat.
The prevailing story is that intelligence will flow through a small number of centralized model providers, because they have the best models and the infrastructure to run them.
That sounds reasonable today. It may even be true today, but I do not think it is the final form.
OpenAI, Anthropic and the other frontier labs are betting on a world where intelligence remains scarce, expensive, centralized and hard to reproduce. They have the models. They have the talent. They have the infrastructure relationships. They have the APIs. They have the mindshare. They have the brand.
That is a powerful position, but it is not automatically a durable moat.
A model lead is not a moat. It is a lead, and leads decay.
Scarcity is the current business model
The economics of the frontier labs are built around scarcity.
The best models are expensive to train. They are expensive to serve. They require huge amounts of compute, power, memory, networking and operational expertise. They sit behind APIs because very few companies can afford to train or run them at scale.
That creates the feeling of a moat.
If everyone needs intelligence and only a few companies can provide it, then those companies should capture the value.
That is the simple version of the story. It is also the version I am most suspicious of.
Technology attacks scarcity.
That is what it does.
Every layer of the AI stack is currently under pressure to become cheaper, smaller, faster and more distributed. Model architecture improves. Inference improves. Quantization improves. Distillation improves. Hardware improves. Memory improves. Edge devices improve. Developer tooling improves. Open models improve. Specialized models improve.
The question is not whether frontier models will remain impressive. They will.
The question is how many tasks actually require the frontier.
That number may be much smaller than the market currently assumes.
Yesterday’s frontier becomes tomorrow’s local model
Computing history has repeatedly taken capability that started in centralized infrastructure and moved it out to local devices.
Mainframes mattered. Then personal computers mattered. Then servers and cloud mattered. Then mobile and edge mattered. The pattern is not a clean replacement. It is a migration of the default.
The same thing is likely to happen with AI.
Today’s expensive cloud capability becomes tomorrow’s cheap local capability. Not all of it. Not the moving frontier. But enough of it to change the economics.
A local model does not need to be the smartest model in the world to be valuable.
It needs to be good enough for the task, cheap enough to run continuously, private enough to trust, fast enough to feel ambient and integrated enough to disappear into the workflow.
That is a different design target from the frontier leaderboard.
Most everyday AI tasks are not grand acts of genius.
They are repetitive, contextual and close to the user:
- summarize this thread;
- rewrite this paragraph;
- classify this email;
- extract these fields;
- explain this error;
- draft this reply;
- organize these notes;
- search my files;
- help with this form;
- suggest the next step;
- generate a small script;
- automate this local workflow;
- check this document against a known policy.
Those are not all frontier tasks.
They are context tasks.
The advantage goes to whoever owns the context, the device, the operating system, the workflow, the permissions and the user relationship.
That is often not the model lab.
The future is hybrid, not API-only
I am not arguing that cloud AI disappears.
That would be too simplistic.
The frontier will matter. There will be tasks where the best available model is worth paying for. Complex reasoning, scientific work, high-end coding, multimodal generation, long-context synthesis, difficult planning, regulated review, specialist analysis and heavy agentic workflows may all require cloud-scale models for a long time.
But the default will shift.
The mature AI stack is likely to be hybrid:
- local models for private, low-latency, ambient and repetitive tasks;
- device and operating system models for personal context;
- enterprise-local or tenant-local models for sensitive business data;
- specialized models for narrow domains;
- frontier cloud models for escalation;
- orchestration layers deciding what runs where.
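To make the last item concrete, here is a minimal sketch of what an orchestration layer's routing decision could look like. Everything in it is an assumption made for illustration: the `Task` fields, the `difficulty` score, the `escalation_threshold` and the order of the checks are hypothetical, not a description of any real product.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Tier(Enum):
    """The execution tiers named in the list above."""
    LOCAL = "local model"
    DEVICE_OS = "device or operating system model"
    TENANT = "enterprise-local or tenant-local model"
    SPECIALIZED = "specialized model"
    FRONTIER = "frontier cloud model"


@dataclass(frozen=True)
class Task:
    # All fields are hypothetical signals an orchestrator might have.
    name: str
    needs_personal_context: bool = False
    sensitive_business_data: bool = False
    narrow_domain: Optional[str] = None
    difficulty: float = 0.0  # assumed score in [0, 1]


def route(task: Task, escalation_threshold: float = 0.8) -> Tier:
    """Pick a tier for a task.

    The ordering encodes one assumption: data constraints are
    checked before capability, so sensitive work never escalates.
    """
    if task.sensitive_business_data:
        return Tier.TENANT
    if task.difficulty >= escalation_threshold:
        return Tier.FRONTIER
    if task.narrow_domain is not None:
        return Tier.SPECIALIZED
    if task.needs_personal_context:
        return Tier.DEVICE_OS
    return Tier.LOCAL


if __name__ == "__main__":
    for t in (
        Task("summarize this thread"),
        Task("search my files", needs_personal_context=True),
        Task("difficult planning", difficulty=0.9),
    ):
        print(f"{t.name} -> {route(t).value}")
```

The design choice worth noticing is that the frontier model is one branch among several, reached only when a task clears the threshold. That is the sense in which the cloud model becomes an escalation path in this stack.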
Apple Intelligence is already explicitly framed around on-device processing backed by Private Cloud Compute, and Microsoft is using Copilot+ PCs to push NPU-based AI down onto the device. That does not prove the hybrid future is settled. It does show that the market is already moving beyond the fantasy that every meaningful act of intelligence stays behind one remote endpoint.
A hybrid stack is very different from a world where every meaningful unit of intelligence flows through a handful of remote APIs.
The API remains useful.
It stops being the entire architecture.
API-only versus hybrid AI
The comparison is between the current centralized market story and the hybrid stack this chapter argues for. In the API-only story, a few remote model providers remain the default path for nearly all meaningful intelligence:
- Default surface: users go to one destination product or API endpoint to reach intelligence.
- Where context lives: important context must be shipped outward to the model provider.
- Economic logic: scarcity and API access look like the main source of durable rents.
- What becomes durable: the moat is assumed to sit mainly in frontier model quality and infrastructure access.
Distribution beats model purity
In immature markets, people overvalue the pure technology layer.
They assume the best technology captures the market.
Sometimes it does.
Often it does not.
Distribution matters. Defaults matter. Workflow matters. Trust matters. Data matters. Integration matters. Switching costs matter. Procurement matters. Regulation matters. Developer ecosystems matter. Existing user behaviour matters.
| Claimed moat | Why it looks strong today | Why it weakens in the mature AI economy | What becomes more durable |
|---|---|---|---|
| Frontier model quality | The best model is still visibly better | Many commercial tasks only need good-enough intelligence | Context, workflow, permissions, user trust |
| Centralized API access | It is the easiest way to ship intelligence quickly | More tasks become local, embedded, hybrid, or routed | Orchestration and distribution |
| Brand mindshare | Users currently go to the model destination on purpose | Mature AI becomes ambient and less model-visible | Default surfaces inside devices and operating systems |
| Training scale alone | Scarcity looks like a moat in the boom phase | Scarcity gets attacked by diffusion, distillation, and specialization | Control of execution and customer relationship |
That is why device and platform companies are so important.
Apple can put models on the device, inside the operating system, close to photos, messages, mail, calendar, files, apps, identity and privacy controls.
Microsoft can put models inside Windows, Office, Teams, GitHub, Azure, identity, security tooling and enterprise administration.
Google can put models inside Android, Search, Chrome, Workspace, Gmail, YouTube, Maps and cloud infrastructure.
Samsung, Qualcomm, AMD, Intel and others can participate through devices, NPUs and local compute.
The model lab may have the better model today.
The platform company has the user, the context and the default path.
That matters because most users do not want to choose models. They want features that work.
Nobody asks which machine learning model sorted their inbox. They care whether the inbox is useful.
Nobody cares which ranking algorithm surfaced the right photo. They care that the photo appears.
Nobody cares which speech model transcribed their note. They care that the note is there.
As AI matures, model identity may become less visible to the user.
That is a dangerous future for pure model brands.
ChatGPT as the AOL of AI
I do not mean this as an insult.
AOL was historically important. It made the internet accessible to millions of people. It gave people email, chat, content, billing, community and a safe front door into a confusing new world.
For many people, AOL was the internet.
Until it was not.
The mature internet did not belong to AOL. It belonged to a different set of layers: broadband, browsers, search, ecommerce, cloud, mobile operating systems, social networks, app stores, advertising platforms and streaming services.
ChatGPT may play a similar role.
It made AI legible. It gave ordinary people a way to touch the new paradigm. It turned artificial intelligence from a background technology into a direct consumer experience.
That is enormous.
But being the first widely understood gateway does not guarantee ownership of the mature economy.
The mature AI economy may not be a chat box.
It may be invisible.
It may live inside operating systems, tools, devices, workflows, enterprise control planes and generated software.
It may be everywhere and nowhere.
That would make ChatGPT historically important even if the final profit pool migrates elsewhere.
Model quality diffuses
The other danger for the API moat is diffusion.
Model quality spreads faster than people expect.
The frontier may keep moving, but the trailing edge becomes stronger. Open models improve. Smaller models improve. Techniques move through papers, code, model releases, talent movement and imitation. Distillation moves behaviour from larger models into smaller ones. Synthetic data improves training. Specialized models beat general models in narrow domains.
This creates a compression effect.
The best model may remain meaningfully better at the hardest tasks.
But many commercial tasks only require good-enough intelligence.
Once good-enough intelligence is cheap, local or bundled, pricing power falls.
This does not destroy the frontier labs. It changes what they can charge for and how often they are needed.
The pure API model becomes more like premium escalation.
Useful. Valuable. Important.
Not universal.
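A rough way to see what premium escalation does to the economics is to write the arithmetic down. The sketch below assumes every task is tried on a cheap local model first and only a fraction is escalated to a frontier API. The function names are hypothetical, and the costs are in arbitrary units with no real prices behind them.

```python
def expected_cost_per_task(
    local_cost: float, frontier_cost: float, escalation_rate: float
) -> float:
    """Buyer's expected cost per task.

    Assumes every task pays the local cost, and a fraction
    `escalation_rate` also pays the frontier cost.
    """
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return local_cost + escalation_rate * frontier_cost


def frontier_revenue_per_task(
    frontier_cost: float, escalation_rate: float
) -> float:
    """What the frontier provider collects per task, on average."""
    if not 0.0 <= escalation_rate <= 1.0:
        raise ValueError("escalation_rate must be between 0 and 1")
    return escalation_rate * frontier_cost


if __name__ == "__main__":
    # Illustrative units only: local = 1, frontier = 100.
    for rate in (1.0, 0.25, 0.05):
        cost = expected_cost_per_task(1.0, 100.0, rate)
        revenue = frontier_revenue_per_task(100.0, rate)
        print(f"escalation={rate:.2f} buyer_cost={cost:.2f} "
              f"frontier_revenue={revenue:.2f}")
```

Under these assumptions the frontier provider's revenue per task scales directly with the escalation rate. The frontier price can stay high while the share of tasks that ever pay it shrinks.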
Enterprise buyers will resist permanent dependency
Large enterprises do not like unnecessary dependency.
They tolerate it when there is no alternative. They pay for Salesforce, SAP, Microsoft, ServiceNow, Oracle, Workday, Atlassian and cloud platforms because those products are deeply embedded and difficult to replace. But every dependency becomes a negotiation point.
AI increases the stakes.
Sending sensitive enterprise context to an external model provider is not a trivial architectural decision. It raises questions about privacy, security, regulation, auditability, data residency, IP leakage, vendor lock-in, cost predictability and operational control.
Some companies will use external APIs heavily.
Others will prefer tenant-local, private-cloud, self-hosted, open-model or hybrid approaches.
The more strategic AI becomes, the more enterprises will want control.
This again weakens the idea that a few centralized model APIs become the permanent default for business intelligence.
Enterprises may buy frontier capability.
But they will also build internal AI control planes.
They will route tasks. They will classify data. They will decide what runs locally, what runs in a private tenant, what runs through a managed API and what requires human approval.
In that world, the model provider is a supplier.
Not necessarily the owner of the workflow.
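A minimal sketch of such a control plane policy, assuming the enterprise has already labelled its data, might look like the following. The data classes, the policy table and the approval rule are hypothetical choices for illustration. A real policy would be set by the enterprise's own security and compliance teams.

```python
from dataclasses import dataclass
from enum import Enum


class DataClass(Enum):
    # Hypothetical labels an enterprise might assign to its data.
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    REGULATED = 4


class Venue(Enum):
    LOCAL = "local"
    PRIVATE_TENANT = "private tenant"
    MANAGED_API = "managed API"


@dataclass(frozen=True)
class Decision:
    venue: Venue
    needs_human_approval: bool


# Assumed policy table: the more sensitive the data, the closer
# to home the model runs.
POLICY = {
    DataClass.PUBLIC: Venue.MANAGED_API,
    DataClass.INTERNAL: Venue.MANAGED_API,
    DataClass.CONFIDENTIAL: Venue.PRIVATE_TENANT,
    DataClass.REGULATED: Venue.LOCAL,
}

# Assumed rule: actions on these classes need a human sign-off.
APPROVAL_REQUIRED = {DataClass.CONFIDENTIAL, DataClass.REGULATED}


def decide(data_class: DataClass, takes_action: bool) -> Decision:
    """Return where a task runs and whether a human must approve it."""
    venue = POLICY[data_class]
    needs_approval = takes_action and data_class in APPROVAL_REQUIRED
    return Decision(venue, needs_approval)


if __name__ == "__main__":
    print(decide(DataClass.PUBLIC, takes_action=False))
    print(decide(DataClass.REGULATED, takes_action=True))
```

In this sketch the enterprise owns the table. The managed API appears in it as one venue among three, which is what being a supplier looks like in code.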
Infrastructure can be real and still overbuilt
The AI infrastructure buildout may prove useful over time.
That does not mean every investment is well timed.
The dotcom era overbuilt parts of the internet’s physical and financial infrastructure. Some of that capacity became useful later. Some of it was bought cheaply after the crash. Some investors lost everything while later companies benefited from the assets.
AI may repeat that pattern.
Data centres, chips, power contracts, cooling systems, networking, memory and specialized hardware may all be needed for the mature AI economy.
But if the market prices infrastructure as if centralized frontier inference will grow without interruption, while actual demand shifts toward local, hybrid and cheaper models, then the economics change.
The infrastructure can be useful and still disappoint the investors who funded it.
This is a recurring theme in paradigm shifts.
The asset can survive.
The capital structure that funded it often does not.
The model labs need to become something else
OpenAI, Anthropic and their peers are not doomed.
That is not the argument.
The argument is that model quality alone is unlikely to be a permanent moat.
To capture durable value, model labs need to become one or more of the following:
- consumer platforms;
- enterprise platforms;
- operating layers;
- developer ecosystems;
- trusted infrastructure providers;
- device or OS partners;
- orchestration platforms;
- data and workflow owners;
- governance and safety layers;
- marketplaces for agents, tools or services.
They know this. Their strategic behaviour already shows it.
The race is not just to build a better model.
It is to avoid becoming a commodity supplier of intelligence.
But that is hard, because the companies best positioned to own distribution and context are often not the model labs. They are Apple, Microsoft, Google, Amazon, Meta, Samsung, device makers, enterprise software incumbents, cloud platforms and eventually AI-native operating companies.
The model lab starts with magic.
The platform company starts with the customer.
In mature markets, the customer relationship is usually the better place to stand.
The API story breaks when intelligence becomes ambient
The current AI economy is still destination-based.
You go to ChatGPT. You go to Claude. You open the app. You call the API. You paste the document. You ask the question.
That feels natural because the technology is still young.
The mature AI economy will be more ambient.
The intelligence will sit inside the editor, the browser, the operating system, the CRM, the ERP, the calendar, the inbox, the file system, the call centre, the warehouse, the IDE, the design tool, the finance process, the compliance workflow and the business control plane.
When that happens, the user does not experience AI as a model endpoint.
They experience it as a capability of the environment.
That is the point at which the API moat weakens.
The question stops being:
Which model is best?
And becomes:
Which system has the data, permissions, workflow, user trust and execution rights to act?
That is a much harder question for pure model labs.
Intelligence becomes a component
The final form of AI may not be a product category called AI.
It may be a component in almost every product category.
That is what happens when technologies mature. Electricity disappeared into devices. Networking disappeared into software. Databases disappeared into applications. Cloud disappeared into product delivery. Machine learning disappeared into recommendations, fraud detection, ranking, search and logistics.
AI may disappear too.
Not because it becomes less important.
Because it becomes assumed.
That is the danger for companies whose valuation depends on intelligence remaining visible, scarce and separately priced.
In the end, the moat may not be the model.
It may be the place where the model runs, the data it can see, the actions it is allowed to take, the workflow it belongs to and the trust structure around it.
That is why I think intelligence as an API is a transitional phase.
It is powerful.
It is necessary.
It is not the final ownership layer.
In Part 4, I follow that logic into enterprise software and ask what happens when software itself becomes easier to generate around the business.