Three years ago it seemed like every demo had one. Inworld and Convai were the talk of GDC. Joon Sung Park's Generative Agents paper (the "AI Village" one) described an entire village of NPC agents coordinating with each other. Altera even showed off AI NPCs in Minecraft that could collaborate and form their own governments.
Now it's 2026. Name a game that you play because of its "AI NPCs".
You can't, not really.
So...what happened? Was it overhyped? Is the tech simply not there?
After almost two years building in this space, I'd like to give my 3 cents:
1. The unit economics just don't make sense
We'll start with the most obvious reason (any dev going into this arena will have immediately stepped on this rake) - there's a negative incentive with the cost of inference. Put simply:
the more the player chats → the more the dev has to pay
So devs have to ask themselves - "is it really worth it to add an AI NPC to my game?" If the answer's no, then one might as well just make a traditional game. After all, handwritten dialogue trees are cheaper, more controllable, easier to QA, and often better at delivering the intended experience.
What AI does well is increase engagement and retention through companion chat - Character.AI proved that back in 2023. But games are already engineered to hold attention. Slapping an AI NPC on top of an existing game only makes it marginally more immersive. If it doesn't increase retention or revenue, the math doesn't work.
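To make "the more the player chats → the more the dev has to pay" concrete, here is a rough back-of-the-envelope sketch. Every number in it (token counts, prices, revenue per player) is a made-up placeholder rather than any real provider's pricing, and `ChatCostModel` and `monthly_margin_per_player` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass
class ChatCostModel:
    """Per-message inference cost. All values are hypothetical placeholders."""

    input_tokens_per_message: int        # prompt + chat history sent with each player message
    output_tokens_per_message: int       # tokens in the NPC's reply
    usd_per_million_input_tokens: float  # assumed price, not a real quote
    usd_per_million_output_tokens: float # assumed price, not a real quote

    def cost_per_message(self) -> float:
        input_cost = self.input_tokens_per_message * self.usd_per_million_input_tokens / 1_000_000
        output_cost = self.output_tokens_per_message * self.usd_per_million_output_tokens / 1_000_000
        return input_cost + output_cost

    def monthly_cost_per_player(self, messages_per_day: float, days: int = 30) -> float:
        # Cost scales linearly with how much the player chats.
        return self.cost_per_message() * messages_per_day * days


def monthly_margin_per_player(
    model: ChatCostModel,
    messages_per_day: float,
    monthly_revenue_per_player: float,
) -> float:
    """Revenue minus inference cost for one player over one month."""
    return monthly_revenue_per_player - model.monthly_cost_per_player(messages_per_day)


if __name__ == "__main__":
    # Assumed numbers for illustration only.
    model = ChatCostModel(
        input_tokens_per_message=2000,
        output_tokens_per_message=150,
        usd_per_million_input_tokens=1.0,
        usd_per_million_output_tokens=4.0,
    )
    revenue = 5.0  # hypothetical monthly revenue per player, in USD

    for messages_per_day in (10, 50, 200):
        cost = model.monthly_cost_per_player(messages_per_day)
        margin = monthly_margin_per_player(model, messages_per_day, revenue)
        print(f"{messages_per_day:>4} msgs/day  cost=${cost:.2f}  margin=${margin:.2f}")
```

With these placeholder inputs, the most engaged players are exactly the ones who push the margin negative, which is the rake described above. Swap in your own token counts and prices to see where your break-even sits.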
This isn't to say there aren't solutions. Here are two potential paths:
- Design a game from the ground up around AI chat.
This could range from a dating sim/otome game to an immersive RPG. The important thing is that while inference costs are still a factor, the bet is that player engagement is directly tied to the AI companion experience - it's essentially a new kind of game category. Here I want to shout out Fai Nur's Status (which recently raised $17M). It gamifies the thrill of growing a Twitter account using AI companion followers - an entirely new kind of game, only possible with the advent of LLM chat.
- Use local LLMs to negate the cost of inference.
The most interesting startup here is Piero Molino's Studio Atelico - their first move was getting the famously expensive AI Village simulation running locally on device. The trade-off is a slightly slower, slightly dumber model. But once that gap closes, the negative economic incentive flips on its head.
Even with the math fixed, though, you still have to ask whether AI NPCs actually make the game more fun. Which brings us to the next problem.
2. Just because it "works" doesn't mean it's actually fun
Is it possible to define an eval for "play" or "fun"?
Feels like a question for philosophers, not developers. It's no surprise that the field has been focused on questions that are actually measurable today:
- Can we make an AI that follows in-game rules?
- Can we make it respond to commands accurately?
- Can we make it win?
Minecraft was one of the hottest arenas for this. In 2023, the benchmark everyone optimized against was Voyager. Nvidia researchers put a GPT-4 agent in Minecraft, gave it a code-writing loop and a growing skill library, and let it grind through the tech tree till it could mine diamonds.
While technically impressive, it was far from something anyone could actually "play" with.
We tried it ourselves. The agent would stand still, search for nearby blocks, log activity in chat, then every few minutes teleport around and perform a burst of actions. It did not really react to the world. It did not meaningfully react to the player. It felt less like a character and more like an algorithm wearing an avatar.
More recent incarnations behave more companion-like. Mindcraft by Emergent Garden lets players chat, give orders, and even co-build (when it works) - and you might have run into the hundreds of viral AI Minecraft videos on YouTube showcasing these abilities (all of which are dramatized to make the AI seem way more capable and smooth than it is right now).
Last year we made a prototype where we integrated our own Oto companions with Mindcraft (Oto took care of the personality and voice, Mindcraft took care of in-game actions). As a long-time Minecraft player I can say the experience felt shockingly good. Even playtesters reacted incredibly positively...at least initially.
The bot could chat, it could see, it could assist. But it couldn't grow with you. It couldn't laugh with you or be surprised alongside you. The initial novelty and delight of the playtesters gradually waned into "what else can it do?" And users weren't interested in coming back.
In the end it was still a glorified NPC - revealing a much deeper-seated issue.
3. AI is still too uncanny
This might be the most damning problem.
Despite the progress in LLMs, tool calling, voice, memory, and multimodal perception (not to mention advanced avatars, procedural animations, and complex facial expressions), AI still feels like AI.
In some ways, today's models are too smart for games.
When you talk to an AI NPC and it knows too much, the illusion breaks immediately. A medieval peasant who can explain quantum mechanics, a shopkeeper who speaks like a life coach, a companion who calmly answers every question with encyclopedic confidence - these characters do not feel alive. They feel like ChatGPT in costume.
Even with strong personality prompting, the illusion is fragile. Players can push on the edges. They can jailbreak the character. They can ask weird questions. They can drag the NPC out of the fantasy. We have already seen this happen publicly with AI characters in mainstream games, like the AI Vader Fortnite incident.
But the uncanniness goes deeper than knowledge boundaries. In games, characters need to feel situated. They need to have their own motives, rhythms, moods, and limitations. They need to interrupt you sometimes. They need to notice things before you point them out. They need to get bored, excited, annoyed, confused, scared, or delighted.
Most AI agents do not do that.
Instead, they default to being helpful. And helpfulness is not the same thing as personhood.
AI also struggles with novelty. It does not genuinely find things surprising. It does not laugh because something unexpected happened. It does not build emotional texture around a shared moment. Instead, it often collapses into one of two modes: know-it-all or therapist.
Worst of all, AI characters do not really grow. They may have memory from summarizing prior conversations and they may even recall facts about the player. But they do not become different as a result of what happened. Their intelligence does not evolve. Their personality rarely changes in a believable way. Their relationship with the player often feels stored rather than lived.
That is fatal for games, because games are constantly generating new situations. The player changes. The world changes. The stakes change. If the NPC does not change too, it eventually starts to feel hollow.
These problems won't solve themselves
My prediction is that AI NPCs will actually become more uncanny before they become less.
That sounds strange given the pace of AI progress, but just look at the incentives - they're all pointing in the wrong direction.
The big foundation model labs are being pulled toward B2B "call center" use cases. They are optimizing for assistants, enterprise workflows, coding, research, support, and productivity. These models will keep getting more knowledgeable, more compliant, more polished, and more generally useful.
But a great AI NPC does not need to be a perfect assistant.
In fact, it probably should not be.
Games need AI that can:
- Interrupt the player ("Watch out!")
- Be proactive (prioritize its own goals, sometimes to the player's detriment)
- Stay on for hours (even 3-hour sessions are rookie numbers for gamers)
- Banter and push back (not just be a sycophant)
- Emote - laugh, cry, scream (watch any streamer these days - they're loud af)
- Know when to talk and when to shut up
- Coordinate with multiple players, systems, and agents simultaneously
No foundation model lab is treating this as the core problem right now (DM me if I'm wrong). Part of the issue is that there has not yet been a breakout AI-native game to prove the category. Another part is that we do not really know how to measure progress here. There is no clean benchmark for "this character feels alive." No eval for "the player formed an attachment." No leaderboard for "this NPC made the world feel more real."
So the space is stuck between impressive demos and sustainable products.
The breakthrough probably will not come from someone optimizing a benchmark, and it will not simply arrive as a side effect of the next foundation model release. To solve these fundamental problems for AI NPCs we need to rethink the stack from first principles.
If you're interested in hard problems too, let's have a chat.


























