GitHub - agrin96/VibegraphGenerator: Proof of concept for using LLMs to generate an application by following a recursive graph decomposition of the application's architecture.

Graph-based Code Generation Exploration

In this repo I explore the idea of generating a software implementation via a structured graph decomposition. The system consists of three phases: the planner, the plangraph, and the codegenerator. The planner is a simple LLM loop that gathers user requirements by prompting the user with clarifying questions until the model judges that it has adequate information about the user's intent.
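To make the planner loop concrete, here is a minimal sketch of the ask-until-satisfied pattern. It is not the repo's code: `gather_requirements`, `ask_model`, `ask_user`, and the `max_rounds` cap are all hypothetical names, and the LLM is replaced by a scripted stand-in so the example runs offline.

```python
from typing import Callable, List, Optional

# Hypothetical interfaces (assumptions, not the repo's API):
# ask_model sees the notes so far and returns the next clarifying question,
# or None once it judges the requirements adequate.
AskModel = Callable[[List[str]], Optional[str]]
AskUser = Callable[[str], str]


def gather_requirements(
    initial_request: str,
    ask_model: AskModel,
    ask_user: AskUser,
    max_rounds: int = 10,
) -> List[str]:
    """Collect question/answer pairs until the model stops asking."""
    notes = [f"request: {initial_request}"]
    for _ in range(max_rounds):  # cap the loop so it always terminates
        question = ask_model(notes)
        if question is None:  # model judges it has enough information
            break
        notes.append(f"Q: {question}")
        notes.append(f"A: {ask_user(question)}")
    return notes


def scripted_model(notes: List[str]) -> Optional[str]:
    """Stand-in for an LLM call: asks two fixed questions, then stops."""
    questions = ["Which operations are needed?", "Keyboard input only?"]
    asked = sum(1 for n in notes if n.startswith("Q: "))
    return questions[asked] if asked < len(questions) else None


plan_notes = gather_requirements("tui calculator", scripted_model, lambda q: "yes")
```

In a real run `ask_model` would wrap an LLM call and `ask_user` would read from the terminal. The round cap is a design choice of this sketch, so that a model that never feels "done" cannot loop forever.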

In the plangraph phase, the user's plan is the input to a graph builder. Each graph is a recursive decomposition of the plan into individual self-contained components. A component can have inline responsibilities and delegated responsibilities; whether it has any delegated responsibilities determines whether it is an intermediate node or a leaf in the graph.
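One way to picture such a node is the sketch below. It reflects my reading of the description rather than the repo's actual schema: the `Component` class, its field names, and the example calculator components are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List


@dataclass
class Component:
    """Hypothetical plangraph node: a sketch, not the repo's schema."""

    name: str
    # Responsibilities the component implements itself.
    inline: List[str] = field(default_factory=list)
    # Responsibilities handed off, keyed by description, each owned by a child.
    delegated: Dict[str, "Component"] = field(default_factory=dict)

    @property
    def is_leaf(self) -> bool:
        # A node with nothing delegated has no children, so it is a leaf.
        return not self.delegated

    def walk(self) -> Iterator["Component"]:
        """Yield this node, then every descendant, depth first."""
        yield self
        for child in self.delegated.values():
            yield from child.walk()


# Made-up decomposition for illustration only.
calculator = Component(
    name="calculator_app",
    inline=["start the event loop"],
    delegated={
        "render the display": Component("display", inline=["format the current value"]),
        "evaluate expressions": Component("evaluator", inline=["apply + - * /"]),
    },
)

leaf_names = [c.name for c in calculator.walk() if c.is_leaf]
```

Here `calculator_app` is an intermediate node because it delegates work, while `display` and `evaluator` are leaves.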

The theory was that a graph structure can better ground the LLM's code generation and let it focus on one component at a time without requiring a "whole-project" context. The plangraph is an alternative to the typical LLM workflow of markdown planning documents that are prompt-injected into an LLM to generate the desired outcome.
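The "one component at a time" idea can be sketched as follows. This is an illustration under assumptions, not the repo's implementation: the `Node` class, the `node_context` prompt format, and the children-first ordering are all my own choices for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical graph node used only for this sketch."""

    name: str
    inline: List[str] = field(default_factory=list)
    children: List["Node"] = field(default_factory=list)


def node_context(node: Node) -> str:
    """Build a prompt context from one node and the names of its direct children.

    Nothing deeper in the graph is included, so the context stays local
    instead of describing the whole project.
    """
    lines = [f"Component: {node.name}"]
    lines += [f"- implement: {r}" for r in node.inline]
    lines += [f"- delegate to: {c.name}" for c in node.children]
    return "\n".join(lines)


def generation_order(root: Node) -> List[Node]:
    """Post-order traversal: children first, so a parent is generated
    after the components it delegates to (an assumed ordering)."""
    order: List[Node] = []
    for child in root.children:
        order.extend(generation_order(child))
    order.append(root)
    return order


tree = Node(
    "app",
    ["wire components together"],
    [
        Node("ui", ["draw keypad"], [Node("button", ["handle key press"])]),
        Node("engine", ["evaluate expression"]),
    ],
)

order_names = [n.name for n in generation_order(tree)]
root_prompt = node_context(tree)
```

The prompt for `app` mentions `ui` and `engine` by name but contains nothing about `button`, which is the point of working from the graph rather than from a single planning document.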

Results

The plangraph module actually worked reasonably well despite my initial misgivings. In my tests, the agent was able to recursively decompose problems in what seems like a reasonable manner. The repo includes an example plangraph for a simple TUI calculator app, generated from an example plan defined in main.py.

For the actual code generation, the coder agent and orchestrator agent both work pretty well at their assigned tasks. In particular, I was rather impressed by how well the coder could generate functional components from a plangraph node description. It really reinforced my thesis that structured guardrails and precise deterministic tooling are the right way to utilize agents. That being said, it does sometimes get stuck on outdated syntax and in loops of failing tests, though this can be chalked up to a rather haphazard implementation of a web search tool. In my tests I found that precise documentation lookup is extremely important to a successful generation, much as humans spend time googling.

The orchestrator does its job, though there isn't much fanfare with it. It's a good way to guide the graph generation process in a semi-deterministic way. It also facilitates regeneration and user feedback, which are needed for a generative workflow.

The scaffolding around my code generation is also rather raw and relies on a global pyproject structure - but this isn't meant to be anything more than a research/learning experience.

Conclusion

The idea I had is "proven" in a sense, but there are some serious drawbacks. The first is that my iterative generation approach requires a lot of tokens. A single pass for the 5-component calculator app runs for over an hour. I am sure a lot of this is due to LLM latency and my usage of cheap models, but this type of decomposition is both time- and token-inefficient to run.

Is the code much better than what a dedicated claude-code/codex session can generate? I don't know; probably not. But it could potentially be more maintainable over the long term because of the plangraph structure, which was the real idea and star of the show.
