VibeGame: Exploring Vibe Coding Games

Hugging Face - Blog

Dylan Ebert · 2025-09-29 · via Hugging Face - Blog

The Problem

People are trying to vibe code games. And it kind of works, at first. However, as the project grows, things begin to fall apart. Why? And what can we do about it?

I'll talk about the problem, how I fixed it, and where to go from here.

What Is "Vibe Coding"?

First, what is vibe coding? The term was coined by Andrej Karpathy in a viral tweet, where he described it as a way of coding in which you "fully give in to the vibes, embrace exponentials and forget the code even exists".

Since then, however, it has been used descriptively to mean many different things, anywhere from just "using AI when coding" to "not thinking about the code at all". In this blog post, I'll define it as: using AI as a high-level programming language to build something. Like other programming languages, this benefits from understanding what's going on under the hood, but doesn't necessarily require it.

With this interpretation, you could make a game without understanding code, though knowing the fundamentals still helps.

Context Management

Earlier I mentioned that "as the project grows, things begin to fall apart". There is evidence that model performance degrades as the context window fills up. This is especially problematic in game development, where the context can grow very large, very quickly.

To address this issue, there are many personal ad-hoc solutions, such as writing LLM-specific context directly in the project files, or more comprehensive solutions like Claude Code Development Kit for large-scale context management.

I couldn't find a lightweight, accessible solution that doesn't rely on significant domain knowledge. So I made one: 🧅 Shallot, a simple, lightweight, unopinionated context management system for Claude Code. It relies on two basic commands:

/peel [prompt] to load context at the beginning of a conversation
/nourish to update context at the end of a conversation

Anecdotally, this works well. However, it works best when the project stays lean and well-organized, so all relevant context can easily fit in the model's context window. While Claude Code is used here, the same principles generalize to other models.
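
To make "fits in the context window" a little more concrete, here is a minimal sketch of a budget check. It is not part of Shallot. The four-characters-per-token ratio is only a common rule of thumb rather than a real tokenizer, and the file names and the 50,000-token budget are placeholder assumptions.

```javascript
// Rough context-budget check. The 4-characters-per-token ratio is a rule of
// thumb, not an exact tokenizer; file names and the budget are placeholders.
const CHARS_PER_TOKEN = 4; // assumption: approximate average for text and code

function estimateTokens(text) {
  return Math.ceil(text.length / CHARS_PER_TOKEN);
}

// files: array of { path, content }
// budgetTokens: how much of the context window to spend on project files
function summarizeContext(files, budgetTokens) {
  const perFile = files
    .map((f) => ({ path: f.path, tokens: estimateTokens(f.content) }))
    .sort((a, b) => b.tokens - a.tokens); // largest files first
  const total = perFile.reduce((sum, f) => sum + f.tokens, 0);
  return { total, fits: total <= budgetTokens, perFile };
}

// Usage with in-memory stand-ins for real project files (hypothetical names).
const report = summarizeContext(
  [
    { path: "src/player.js", content: "x".repeat(8000) },
    { path: "src/grass.js", content: "x".repeat(2000) },
  ],
  50000
);
console.log(report.total, report.fits); // 2500 true
```

Sorting the largest files first makes it easy to see what to trim or summarize when a project stops fitting.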

Beyond context management tools, platform choice is critical. The platform should ideally naturally keep projects lean through high-level abstractions, while also being something AI models understand well. So, what existing platforms are best suited for vibe coding?

Initial Exploration

I initially tried three approaches to vibe coding games: Roblox MCP, Unity MCP, and the web stack. With each, I tried to build a simple incremental game inspired by Grass Cutting Incremental, using Claude Code.

Here's how it went:

Attempt 1: Roblox MCP

The official MCP server from Roblox. This allows AI to interact with Roblox Studio by sending commands to run code.

Pros:

Excellent level of abstraction with built-in game mechanics
AI could very easily understand the syntax and convert instructions to code

Cons:

No files, only using code to read data, which severely limits context management
Very limited runtime information for AI to work with
Proprietary walled garden

Roblox provides an excellent layer of abstraction for keeping the codebase lean and manageable, which is perfect for vibe coding. However, the walled garden and lack of context make it infeasible for vibe coding, unless it's done in-house at Roblox.

Attempt 2: Unity MCP

The unofficial MCP server for Unity. This allows AI to interact with the Unity Editor: reading the console, managing assets, and validating scripts.

Pros:

Full file system access

Cons:

There are many ways to do everything in Unity, changing frequently across versions, causing AI to get confused
Requires significant domain knowledge to tell the AI how to do things, rather than what to do
AI performance was inconsistent and unreliable
Proprietary engine (though much more transparent than Roblox)

Unity is a powerful engine with a lot of capabilities. However, the complexity and variability of the engine make it difficult for AI to consistently produce good results without significant user domain knowledge.

Attempt 3: Web Stack

The open web platform, using three.js for 3D rendering, rapier for physics, and bitecs for game logic.

Pros:

Far superior AI proficiency compared to game engines, likely due to massive training data
Full file system access
Fully open source stack with complete control/transparency

Cons:

Relatively low level libraries, requiring essentially building the engine before building the game
Lack of ecosystem for high-quality 3D games; web tends toward 2D games and simple 3D experiences

This approach had the best AI performance by far, likely due to the vast amount of web development data available during training. However, the low-level nature of the libraries meant that I had to essentially build a game engine before I could build the game itself. Once that engine layer exists, it allows us to work at a much higher level of abstraction, like we did with Roblox.

Although it required building an engine first, this approach was the only one that produced a fun result without heavy domain knowledge.

Comparison Summary

Platform	AI Performance	Abstraction Level	Context Management	Open Source
Roblox	⭐⭐⭐	⭐⭐⭐⭐⭐	⭐	❌
Unity	⭐	⭐⭐	⭐⭐⭐	❌
Web	⭐⭐⭐⭐⭐	⭐	⭐⭐⭐⭐⭐	✅

The Solution: VibeGame

After these experiments, I had a clear picture: the web stack had excellent AI performance but was too low-level, while Roblox had perfect abstraction but lacked openness and context management.

So, what about combining the best of both?

Introducing VibeGame, a high-level declarative game engine built on top of three.js, rapier, and bitecs, designed specifically for AI-assisted game development.

Design Philosophy

There were three key decisions that went into the design of VibeGame:

Abstraction: A high-level abstraction with built-in features like physics, rendering, and common game mechanics, keeping the codebase lean and manageable. This takes inspiration from popular high-level sandbox games/game "engines" like Roblox, Fortnite UEFN, and Minecraft.
Syntax: A declarative XML-like syntax for defining game objects and their properties, making it easy for AI to understand and generate code. This is similar to HTML/CSS, which AI models are already proficient in.
Architecture: An Entity-Component-System (ECS) architecture for scalability and flexibility. ECS separates data (components) from behavior (systems), encouraging the project to stay modular and organized as it grows, conducive to vibe coding and context management.
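
To illustrate the split between data and behavior that ECS encourages, here is a minimal sketch in plain JavaScript. It deliberately does not use the bitecs or VibeGame APIs. Every name in it (`Position`, `Velocity`, `createEntity`, `movementSystem`) is hypothetical and chosen only for illustration.

```javascript
// Minimal ECS sketch in plain JavaScript. All names are illustrative;
// this is not the bitecs or VibeGame API.

// Components: plain data, stored per entity id.
const Position = new Map(); // entity id -> { x, y, z }
const Velocity = new Map(); // entity id -> { x, y, z }

// Entities: just unique ids.
let nextEntity = 0;
function createEntity() {
  return nextEntity++;
}

// System: behavior that runs over every entity with the components it needs.
function movementSystem(dt) {
  for (const [entity, vel] of Velocity) {
    const pos = Position.get(entity);
    if (!pos) continue; // skip entities that have no Position
    pos.x += vel.x * dt;
    pos.y += vel.y * dt;
    pos.z += vel.z * dt;
  }
}

const ball = createEntity();
Position.set(ball, { x: -2, y: 4, z: -3 });
Velocity.set(ball, { x: 0, y: -1, z: 0 });

const ground = createEntity();
Position.set(ground, { x: 0, y: -0.5, z: 0 }); // no Velocity, so it never moves

movementSystem(0.5); // advance the simulation by half a second
console.log(Position.get(ball).y); // 3.5
```

Because a system only touches the components it asks for, adding a new mechanic usually means adding a new component and system rather than editing existing ones, which keeps each file small enough to load into context on its own.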

A basic game looks like this:

<world canvas="#game-canvas" sky="#87ceeb">
  <!-- Ground -->
  <static-part pos="0 -0.5 0" shape="box" size="20 1 20" color="#90ee90"></static-part>

  <!-- Ball -->
  <dynamic-part pos="-2 4 -3" shape="sphere" size="1" color="#ff4500"></dynamic-part>
</world>

<canvas id="game-canvas"></canvas>

<script type="module">
  import * as GAME from 'vibegame';
  GAME.run();
</script>

See it in action in this JSFiddle or the Live Demo.

This will create a simple scene with a ground plane and a falling ball. The player, camera, and lighting are created automatically. All of this is modular and can be replaced. Arbitrary custom components and systems can be added as needed.
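
As a rough idea of how declarative attributes can become component data, the sketch below parses the attribute strings from the `<dynamic-part>` element above into numbers. This is an assumption about how such an engine might work, not code from VibeGame. `parseVec3` and `ballData` are hypothetical names.

```javascript
// Hypothetical helper: turns a space-separated attribute value such as
// pos="0 -0.5 0" into numeric fields. Not taken from VibeGame's source.
function parseVec3(value) {
  const parts = value.trim().split(/\s+/).map(Number);
  if (parts.length !== 3 || parts.some(Number.isNaN)) {
    throw new Error(`Expected three numbers, got "${value}"`);
  }
  const [x, y, z] = parts;
  return { x, y, z };
}

// Attributes as they appear on the <dynamic-part> element above.
const attrs = { pos: "-2 4 -3", shape: "sphere", size: "1", color: "#ff4500" };

// Plain data that an engine could then attach to an entity as components.
const ballData = {
  position: parseVec3(attrs.pos),
  shape: attrs.shape,
  size: Number(attrs.size),
  color: attrs.color,
};
console.log(ballData.position); // { x: -2, y: 4, z: -3 }
```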

The engine comes bundled with an llms.txt file containing documentation about the engine, designed specifically for AI, to be included in its system prompt or initial context.

So Does It Actually Work?

Yes.

Well, kind of.

Here's the game I built as a test: a simple incremental grass collection game made with VibeGame and Claude Code. It worked very well, requiring minimal domain knowledge for implementing the core game mechanics.

However, there are still some major caveats:

It works well for building what the game engine supports, i.e. a simple platformer or game that only relies on basic physics and rendering.
However, it struggles with anything more complex that isn't yet implemented in the engine, like interaction, inventory, multiplayer, combat, etc.

So, if vibe coding means the one-shot "make me a game" approach, it doesn't work. If it means treating AI as a high-level programming language, it works very well, but requires users to understand the engine's capabilities and limitations.

Try It Yourself

To try it immediately, I built a demo where you can develop a game directly in the browser using VibeGame with Qwen3-Next-80B-A3B-Instruct: Live Demo on Hugging Face.

You can also test it locally with a frontier model, through a tool like Claude Code:

npm create vibegame@latest my-game
cd my-game
npm run dev  # or bun dev

Then, paste the full contents of the included llms.txt into CLAUDE.md, providing full documentation about the engine for the AI to reference (or point your own context management system to it). This works with other models as well.
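
If you'd rather script that copy step than paste by hand, a small Node.js helper along these lines should do it. This is a sketch, not something shipped with VibeGame. The script name `copy-docs.cjs` and the location of llms.txt inside the generated project are assumptions, so adjust the paths to match your setup.

```javascript
// Sketch of a Node.js helper; save as e.g. copy-docs.cjs (hypothetical name).
const fs = require("node:fs");

// Appends the contents of srcPath to destPath, creating destPath if needed.
// Returns true if anything was written, false if the docs were already there.
function appendDocs(srcPath, destPath) {
  const docs = fs.readFileSync(srcPath, "utf8");
  const existing = fs.existsSync(destPath)
    ? fs.readFileSync(destPath, "utf8")
    : "";
  if (existing.includes(docs)) return false; // avoid duplicating the docs
  const separator = existing && !existing.endsWith("\n") ? "\n" : "";
  fs.appendFileSync(destPath, separator + docs);
  return true;
}

// Example call (the llms.txt path is an assumption; adjust to your project):
// appendDocs("llms.txt", "CLAUDE.md");
```

Appending rather than overwriting keeps any notes you already have in CLAUDE.md, and the duplicate check makes the script safe to run more than once.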

What's Next?

The engine is currently very barebones and only supports very basic mechanics (unless you write more from scratch). However, initial results are promising.

Next steps would be:

Flesh out the engine with more built-in mechanics, bringing it closer to parity with early versions of Roblox or UEFN. This includes:

Interaction
Inventory/items
Multiplayer
Skinned meshes/animations with curated database
Audio with curated database

Improve the AI guidance systems, providing beginners with a better experience. This includes:

Clear messaging about engine capabilities/limitations
Guided prompts for common tasks
Many more examples and templates
Educational resources

It's also worth exploring how vibe coding games could harness more proven engines. For example, building a high-level sandbox game editor on top of Unity or Unreal Engine (similar to how Unreal Editor for Fortnite is built on Unreal Engine) could provide a more controlled environment for AI to work with, while leveraging the power of established engines.

We're also likely to see more in-house solutions from major players.

Follow me to keep up with what's going on in the space!
