@adlrocha - AI inequality: from GPU-poor to token-poor

Last week I ended this newsletter asking whether the people inside the labs actually see the same end-of-cycle signals the rest of us are reading from the outside, or whether they are rushing their IPOs because they can see something coming that the market hasn’t priced in yet.

This week, Anthropic offered something resembling an answer, at least from a technical standpoint. Fable is here, and it reinforces my argument from last week: model technical capabilities are not actually plateauing and the winter will come from a shortage of funds and adoption.

Fable 5 is the first model in Anthropic’s new Claude 5 family, specifically the first model in a new tier they’re calling Mythos-class (whatever that means), which sits above Opus in capability. The key thing to understand about the release is that Fable 5 and Mythos 5 are essentially the same underlying model. The difference in Fable is not its underlying architecture, but its embedded guardrails.

On benchmarks, Fable tops Cognition’s FrontierCode evaluation for coding, is the first model to break 90% on Anthropic’s core analytics benchmarks (a ten-point leap over Opus) and leads Hebbia’s senior-level finance reasoning evaluation. In biology, it accelerated protein design by roughly 10x, generated 9 of 14 strong drug candidates in a molecular design task, and produced scientific hypotheses that domain experts preferred over Opus-class outputs about 80% of the time. Fable also completed Pokémon FireRed using only vision input, which is either impressive or unsettling depending on your sense of humour (it took me a few dozen hours to finish it when I was a kid, and to get that legendary Pokémon I wanted. Spoiler alert: it was Zapdos). In short, this model is a beast!

But obviously, this beast is expensive: $10 per million input tokens, $50 per million output tokens. Less than half the price of Mythos Preview, but still a price that I don’t know if I would pay for my daily tasks.
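To put those prices in perspective, here is a minimal sketch of the arithmetic. The two prices are the ones quoted above; the workload (200 requests a day, each with 20k input tokens and 2k output tokens) is a made-up assumption for illustration, not a measurement of anyone’s real usage.

```python
# Prices quoted in the post, in USD per million tokens.
INPUT_PRICE_PER_M = 10.0
OUTPUT_PRICE_PER_M = 50.0


def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the quoted per-million prices."""
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


# Hypothetical workload (an assumption, not data from the post):
# 200 requests a day, each with 20k input tokens and 2k output tokens.
REQUESTS_PER_DAY = 200
per_request = request_cost(20_000, 2_000)
daily_cost = REQUESTS_PER_DAY * per_request

print(f"per request: ${per_request:.2f}")
print(f"per day:     ${daily_cost:.2f}")
```

Swap in your own token counts to see where your daily tasks would land. Agentic workflows that re-read large contexts on every step will skew heavily towards the input side.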

Ok, that’s Anthropic’s PR machine, but what does the crowd say about this model? One person in the community who is trying to solve a hard problem in computer science, and to get the most out of these models while doing it, is Victor Taelin (who has made an appearance on this newsletter a few times already). He is building the HVM programming language, which is based on interaction nets. He uses agents extensively for his work (I highly recommend following how he journals it on his X account).

He’d already thrown everything at the problem before: a fleet of 32 GPT-5 agents running for 20 hours each, then Opus 4.8 and GPT-5.5 optimising for 8 hours. The best result was a 6–34% speedup, and the code quality had deteriorated after each iteration and he had to clean things up manually (as described here). Then he asked Fable.

Two hours later: a 1,770% speedup in one case, over 100% in four others, 22% on average. He immediately assumed it was hardcoding the benchmarks, a reflex he calls ‘GPT trauma’ 😆. So he decided to dig deep into the implementation to confirm it. What Fable had found was that HVM5 was wasting time garbage-collecting unused branches of pattern-match nodes. Taelin had already optimised this for static matches, but not dynamic ones. Fable figured out how to do it for the dynamic case and implemented it correctly.

Then, as he was preparing to audit the solution, Fable interrupted him to report a bug in the code Taelin himself had written, a subtle pointer aliasing error in the garbage collection logic that was so specific, Taelin estimated he’d have needed hours or days to find it himself, if he ever had. Fable found it as a side note, while finishing the optimisation.

His dramatic conclusion was: “this isn’t about Anthropic or OpenAI, this is about our collective future as a species.”

So here’s the answer to my capability question from last week. The technology is still accelerating. This suggests that the AI winter framing I offered last week, which was about diffusion and adoption and about companies that haven’t yet learned what to do with the tools, is feasible. Fable clarifies one thing about this framing (one that may make this winter more dramatic): the ceiling may still be going up fast.

This release came with terms and controversy.

The most discussed one was that Fable will silently limit its own capabilities when it detects you’re using it for frontier AI development. Not an explicit refusal, i.e. you won’t see an error message. The model may quietly reduce its effectiveness through prompt modification, steering vectors, or fine-tuning adjustments. The list of affected work includes: building large-model pretraining pipelines, designing data pipelines for training frontier LLMs, debugging model-parallel training systems, working on ML accelerator design, distilling or copying a frontier model, etc.

Anthropic estimated this affects approximately 0.03% of traffic (but I would add that this cohort is probably the one that uses this technology most extensively). When Fable detects frontier LLM development, it does not notify the user; it just limits its own capabilities. For a paying user, that means the model can still sound helpful while being intentionally less capable for a specific category of work.

As the team at SemiAnalysis puts it in the tweet linked above, all this is very reminiscent of nuclear non-proliferation in the 60s. The 1968 Nuclear Non-Proliferation Treaty recognised five nuclear-weapon states (the US, USSR, UK, France, and China) and declared nuclear weapons too dangerous for anyone else to build, while the five who already had them pinky-promised to disarm eventually. India refused to sign, pointing out the obvious: the NPT didn’t decide nukes were too dangerous to exist, just too dangerous for anyone who didn’t already have them. Anthropic limiting Fable for frontier ML work is structurally identical. The danger, conveniently, started the day after they finished.

Others (like Jeremy Howard) reached for a different reference to illustrate the issue with this approach (one that I personally loved). In Liu Cixin’s novel The Three-Body Problem, an alien civilisation deploys a sophon, a particle-scale quantum computer, to infiltrate particle accelerators on Earth and scramble experimental results. Not to destroy science, but to make the next step impossible. I don’t know what you think, but this closely resembles what Anthropic did in trying to slow down AI researchers who want to leverage Fable.

The consequences go beyond individual users. Sayash Kapoor, who runs rigorous AI R&D evaluations, pointed out the downstream problem: if Fable silently degrades for frontier ML tasks without telling you when it’s doing so, third-party evaluators can no longer use the model for serious capability assessments. They can’t distinguish a genuine failure from a classifier intervention. Independent evaluation of what Fable can actually do, one of the few accountability mechanisms that exists for frontier models whose weights are not public and can not be run independently, is now compromised for exactly the category of tasks where it matters most. Anthropic uses the full-capability model for AI R&D internally, through Mythos. The sandbagged version is what the independent evaluation ecosystem gets.

Kapoor refers to it as a ‘dangerous precedent.’ And I agree: other labs like OpenAI may choose to do the same, and it’ll harm the whole infrastructure of independent oversight that is supposed to catch problems before they compound (what companies like Epoch AI do). You can’t audit a model that hides its own limits from you.

And as I was writing the words above something unexpected happened (at least unexpected for some, but definitely the reason why this post will end up being longer than originally planned). Anthropic walked this policy back.

From this moment on, flagged requests will visibly fall back to Opus 4.8, the same pattern used for cyber and bio safeguards. Users will clearly see it when it happens. On the API, flagged requests will return a reason for the refusal. Anthropic’s explanation (i.e. excuse), published on X by the Claude developer account, was: ‘Invisible safeguards can be targeted more narrowly, allowing us to ship quickly with very few false positives. We went with invisible safeguards for this reason, and that was the wrong tradeoff. You should have visibility into the safeguards we have in place, and why. We’re sorry for not getting the balance right.’ This raises its own concerns, though: making the refusals visible makes them easier to work around. You’re telling users the classifier is now worth trying to circumvent. Jailbreak time, folks!

On top of all of this guardrail drama there’s more: if you use Fable or Mythos, Anthropic collects your data for training. No exceptions, not even for enterprise partners. Even more, on 23rd June, Fable access through standard subscriptions closes. After that, it’s pay-as-you-go API only.

So let’s be honest about what this architecture actually is, because I think the separate pieces add up to something more coherent than a collection of safety decisions.

Anthropic releases a model of genuinely extraordinary capability. They make it available to everyone briefly, at least long enough to demonstrate what it can do, long enough for the word to spread, long enough for dependency to form. Then they close the subscription access and move it to consumption pricing. They collect training data from every user throughout. They degrade it for the work that competes with their own research agenda, and when the hidden-degradation policy blows up publicly, they pivot to a visible fallback that’s harder to exploit but creates more false positives. And they reserve the unrestricted version for a small group of approved partners whose criteria they control.

Call me cynical, but to me, that reads more like a business strategy than a safety strategy, especially with an imminent IPO on the horizon. The ‘broad access’ phase is a land grab: you build the market, you demonstrate the capability, you create the switching costs, and then you restructure the pricing. With some caveats, this is what GitHub did with Copilot: generous free access for open-source contributors, then usage-based billing once dependency was established. I wrote about that dynamic a few months ago. This AI bubble is a dependency trap, with subsidised token pricing as the mechanism for building lock-in before extracting value.

Don’t get me wrong, the safety framing may actually be true, and Anthropic may be genuinely optimising for it. But it also needs to make money and get funding, and those two goals may sometimes be at odds with safety.

Coincidentally, this week Dario Amodei published an essay on AI policy that is worth reading alongside the Fable release rather than in isolation. On civil liberties, he writes that people facing government action should have ‘access to AI that is at least as capable as whatever the government is allowed to use.’ The logic makes sense: concentrated capability is a power asymmetry, and power asymmetries require structural remedies. He also names the distribution problem directly, ‘the key challenge in such a world won’t be incentivising growth, but finding a way for everyone to share in the benefits.’

Both of those sentences could have been written by a critic of the Fable access model. Dario doesn’t apply them there.

The principle he uses against government overreach, that capability parity matters, that the powerful having better tools than everyone else is a structural problem requiring structural solutions, is precisely the principle that Anthropic’s own access architecture violates. Researchers inside approved labs run unrestricted Mythos. Researchers outside run a classifier-limited version that quietly degrades for exactly the work that would let them close the gap. Dario says the key challenge is sharing the benefits. His company’s product release concentrates the best tool in the hands of the people who need it least.

He seems to have a genuine belief that safety requires concentration, combined with business incentives that reward concentration, and a policy framework that critiques concentration only when the concentrator is a government. But again, the Fable release may seem at odds with the supposed goal Dario is trying to optimise for.

It reads a bit like hubris.

Last minute edit: This post was reviewed and scheduled Friday morning GMT, and on Saturday I woke up to the news:
Anthropic@AnthropicAI
The US government, citing national security authorities, has issued an export control directive to suspend all access to Fable 5 and Mythos 5 by any foreign national, whether inside or outside the United States, including foreign national Anthropic employees. The net effect of
12:50 AM · Jun 13, 2026 · 41.2K Views
I already warned myself about this a few weeks ago:
adlrocha@adlrocha
Note to self: do not write posts of anything trendy. By the time the draft is ready, the topic is outdated. I’ll need to go back to first principles and immutable fundamentals
12:53 PM · Jan 31, 2026 · 134 Views
But in this case I feel like the new developments strengthen the case I am making in this post (that is why I decided not to change a comma from the original draft apart from this edit). Like always, I’ll follow up with any news in the post’s notes.

None of this invalidates the AI winter analysis, to be clear. For most people, most of the time, Fable is not the model they need: the gap between Sonnet and Fable isn’t the bottleneck for someone using AI to summarise emails or handle routine analysis. Fable doesn’t fix the adoption problem.

But the access question is different. And it’s where the inequality argument starts.

I didn’t pursue a career in machine learning, even though it was one of the topics that interested me the most early on. I worked for some time in NLP, writing LSTM networks and genetic algorithms by hand. I was following the state of the art closely, but then I hit a wall. Access to compute became a blocker for me, and I didn’t have the money then (or now) to fund it myself. My experiments were slow, expensive, and limited by the hardware I could afford. Meanwhile, I could do interesting work in cryptography and distributed systems on any laptop with zero dependency on expensive infrastructure. And so it goes. That’s how I ended up in crypto.

I don’t think about that as a loss (because it clearly wasn’t). But I do think about what it means at scale.

The first version of that inequality was about GPUs, and it impacted researchers. Access to compute created a divide between researchers with institutional resources and everyone else. Open-source models, commodity cloud pricing, and the gradual democratisation of inference have compressed that gap significantly over the last few years. It’s not gone, but it’s narrower than it was (just look at what 0xSero or Antirez are able to do for the local inference community from home).

Fable stresses a second instantiation of this battle between the haves and have-nots. The Fable/Mythos split is the clearest version of this: the most capable model, unrestricted, is available only to ‘approved organisations.’ Everyone else gets Fable: capable, genuinely impressive, but with a ceiling built in for certain categories of work.

But this divide and inequality may start diffusing to the rest of society as we adopt AI more and more for our day-to-day. There are at least three tiers forming, and I think we need to be honest that this is structural rather than temporary.

The first tier: researchers inside major labs. They run unrestricted Mythos 5, they train on proprietary infrastructure, and they work with evals designed to make the model emulate their own best researchers. They work with virtually unlimited resources, and can access the latest capabilities as soon as they are available.

The second tier: professionals and companies who can afford (i) pay-as-you-go API access at $50 per million output tokens, (ii) the expensive subscriptions (or what will become expensive subscriptions), or (iii) the hardware to run capable open-weight models locally. A Ryzen AI Max+ or a high-end Apple Silicon Mac starts at around $2k and gives you serious local inference. That’s accessible to a software engineer in London. It’s not accessible to a researcher in Lagos or Bogotá.
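A rough way to compare options (i) and (iii) is to ask how many Fable output tokens the price of that hardware would buy. The sketch below only uses the two figures quoted above ($50 per million output tokens, $2k of hardware); the daily token volume is a hypothetical assumption, and the comparison deliberately ignores input tokens, electricity, and the capability gap between a local open-weight model and Fable.

```python
# Figures quoted in the post.
OUTPUT_PRICE_PER_M = 50.0  # USD per million output tokens via the API
HARDWARE_COST = 2_000.0    # USD, entry price for a local inference machine


def breakeven_output_tokens(hardware_cost: float, price_per_m: float) -> float:
    """Output tokens you could buy from the API for the price of the hardware."""
    return hardware_cost / price_per_m * 1_000_000


def days_to_breakeven(hardware_cost: float, price_per_m: float,
                      output_tokens_per_day: float) -> float:
    """Days of API usage after which the hardware would have paid for itself."""
    return breakeven_output_tokens(hardware_cost, price_per_m) / output_tokens_per_day


# Hypothetical usage (an assumption): 500k output tokens per day.
TOKENS_PER_DAY = 500_000

tokens = breakeven_output_tokens(HARDWARE_COST, OUTPUT_PRICE_PER_M)
days = days_to_breakeven(HARDWARE_COST, OUTPUT_PRICE_PER_M, TOKENS_PER_DAY)
print(f"break-even: {tokens:,.0f} output tokens, about {days:.0f} days at this usage")
```

Either way, the entry ticket is a few thousand dollars up front or a steady token bill, which is exactly what makes this a tier and not a default.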

The third tier is everyone else, from those who can only afford the free versions of these models (for as long as they are available) to those who can’t even afford that level of access to intelligence. This is the GPU-poor problem, replaying at a different level. Some of these tiers are gaining an unfair advantage that may exacerbate the inequality our financial systems have already established (but this is a topic for some other day). A small team in Spain working on local inference is not competing on an equal footing with a developer working for Anthropic. So it goes. And I consider myself privileged because I have access to many of these tools.

AI capability inequality could compound existing inequalities: the same workers facing real-wage erosion are also the ones least likely to have access to the tools that could help them adapt. This is a problem that I feel Fable has made more visible to everyone since its release.

All the weightwatcher theory was developed on Google Colab and pen and paper | Charles H. Martin, PhD

As you may all know, local inference and AI independence have become something of an obsession for me over the past year. I have the genuine conviction that this is the infrastructure question of the next decade if we don’t want to increase inequality and our dependence on the big AI providers.

I’ve been sharing my progress so far in previous posts. The local inference post was about the hardware layer: what it actually takes to run capable models locally, why memory bandwidth matters more than raw compute, and what the options look like at different budgets. The AI independence post was about the why: the dependency trap, the way subsidised token pricing builds switching costs before the extraction begins. Fable didn’t change the argument. It confirmed it, more loudly than I expected. And I am already making my next experiment (that I hope to publish soon, ping me if you want some spoilers).

Obviously, I am not the only one whose thesis about the need for AI independence has been reinforced by Fable’s release. Gergely Orosz describes it perfectly: SOTA models are becoming more restrictive in usage, less transparent, and less private, and that combination is pushing serious developers toward open models and local inference in a way that pure capability arguments never quite managed. Remember what happened with commercial software licenses and open source? The proprietary incumbents kept pulling ahead on features. But they also kept tightening the terms, raising the prices, and treating users as a revenue problem rather than a constituency. That created the conditions for Linux, for Firefox, and for everything that followed.

The same pressure is running now, and it’s running at two scales simultaneously. At the geopolitical level: Western chip export controls pushed China to accelerate its own open-weight development, in part because dependency on US infrastructure became untenable (this gave DeepSeek, Kimi, Minimax, Qwen and their underlying innovation an excuse to exist). Restriction forced innovation. At the individual level: the same logic now applies to any researcher or developer who finds themselves on the wrong side of Anthropic’s classifiers. If they’re going to sandbag your model, store your prompts, and move the goalposts on pricing, the rational response is to build the infrastructure that doesn’t require their permission. At least, this is what has clicked for me and is giving me the motivation to explore new ways.

Whether the open-weight ecosystem can close the gap fast enough is the part I’m genuinely uncertain about, but what is clear is that local models are already really useful to me. Gemma 4 and Qwen3 handle tasks that would have required GPT-4-class models two years ago. I really think that the capability distance is compressing for the kind of tasks that common mortals may want to perform. Will they get to Fable’s level? I don’t know, but at least I am happy that global access to a basic level of intelligence is getting there in an open (and affordable) form. I’ll talk about this in a future post, but I love the innovations introduced by Apple in their latest 20B on-device foundation model.

The positive note about the Fable release and all this controversy? Restriction accelerates the alternative. The GPU-poor problem got solved by the people it blocked. The open-source software problem got solved by the people the licences excluded. The AI independence problem will get solved by the people Anthropic’s classifiers are aimed at. That cycle has started. Fable just made it more urgent.

The inequality is here. This is the infrastructure fight of the next decade, and it’s one worth having. Join the fight!! :)

Until next week.
