Why I Built Mneme HQ: Preventing AI Agent Architectural Drift

Originally published on theovalmis.com.

Every time you start a new session with an AI coding agent, it has forgotten everything. Not just the small things — the names, the syntax, the last error message. It has forgotten the decisions you made three weeks ago about why you chose Postgres over MongoDB. It has forgotten the auth pattern you locked in after a security review. It has forgotten that you explicitly ruled out a particular library because it caused problems in production.

The model starts cold. You start explaining. Again.

"The problem with AI coding agents isn't intelligence. It's memory. Every session is the first session."

I spent months working on a long-running project with AI assistants — Cursor, Claude Code, GPT-4 — and I noticed the same pattern repeating. The models were brilliant at individual tasks. But they had no continuity. No awareness of the constraints we had already resolved. No sense that this codebase had a history, a set of deliberate architectural choices, things that were decided and should stay decided.

The Drift Problem

I started calling it architectural drift. It happens gradually. The agent suggests a new dependency — one you already evaluated and rejected for good reason. You either catch it (friction, lost time) or you don't (technical debt, inconsistency). The agent reaches for a different database adapter because it's seen it in training more often than yours. The agent refactors a module using a pattern you explicitly moved away from six months ago.

None of this is the model's fault. These systems don't have access to your decision history. They can't know what you know. They see the code in front of them, not the conversation that led to it.

The standard advice is: put everything in your system prompt. Write a long CLAUDE.md. Add comments everywhere. Re-explain at the start of every session.

That's not a solution. That's manual memory management for a tool that's supposed to reduce cognitive load.

What I Wanted

Store decisions where the code lives. Not in a separate wiki. Not in a Notion database that nobody checks. Right in the repository, version-controlled, co-located with the work they govern.

Inject those decisions automatically. Not by copying and pasting into every new chat. By having the tool surface the right constraints at the right moment — before the model has a chance to drift.

Enforce, not just remind. A decision that can be ignored isn't a decision. It's a suggestion. I wanted something that could validate whether the current state of the codebase actually respects the decisions we've recorded — and flag the ones that don't.

Building Mneme HQ

I built Mneme HQ to solve this. The name comes from Mneme, the Muse of memory in Greek mythology and one of the three original Muses. The idea is simple: your project has memory. Your AI assistant should too.

At its core, Mneme HQ stores decisions in structured files that travel with your codebase:

$ mneme add "Use Postgres — no new databases without ADR"
Decision recorded. ID: ADR-001

Those decisions get committed with your code. They're in git. They're reviewable. They're auditable. When you start a new session, Mneme HQ injects the relevant decisions into context — not all of them, just the ones that matter for what you're working on now.
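To make "structured files" and "just the ones that matter" a little more concrete, here is a minimal sketch in Python. It models a decision as a small record, serializes it to JSON so it can be committed, and picks out only the decisions whose path globs match the files being edited. The `Decision` fields, the `scope` globs, and the `relevant_decisions` helper are all assumptions made for illustration; they are not Mneme HQ's actual schema or selection logic.

```python
import json
from dataclasses import dataclass, asdict, field
from fnmatch import fnmatch


@dataclass
class Decision:
    # Hypothetical record shape, not Mneme HQ's real schema.
    id: str
    text: str
    # Path globs the decision governs; "*" means the whole repository.
    scope: list[str] = field(default_factory=lambda: ["*"])


def to_json(decision: Decision) -> str:
    """Serialize a decision so it can be committed alongside the code."""
    return json.dumps(asdict(decision), indent=2, ensure_ascii=False)


def relevant_decisions(decisions: list[Decision], touched_files: list[str]) -> list[Decision]:
    """Keep only decisions whose scope matches at least one touched file."""
    return [
        d for d in decisions
        if any(fnmatch(path, glob) for path in touched_files for glob in d.scope)
    ]


decisions = [
    Decision("ADR-001", "Use Postgres; no new databases without ADR", ["src/db/*", "migrations/*"]),
    Decision("ADR-004", "Data access goes through the repository layer", ["src/*.service.ts"]),
]

# Only the decision that governs the file being edited is selected.
selected = relevant_decisions(decisions, ["src/users/user.service.ts"])
print([d.id for d in selected])
```

Scoping by glob is only one possible way to decide relevance. It keeps the selection deterministic and easy to review in a diff, at the cost of someone having to maintain the globs.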

And before you ship, you can run a pre-flight check:

$ mneme check --mode strict
Checking decisions against current context...

PASS  Storage decision enforced — Postgres locked, no new DBs
PASS  Auth pattern respected — JWT middleware unchanged
WARN  New dependency introduced — prisma not in approved list
FAIL  Violates ADR-004 — Repository pattern bypassed in user.service.ts

2 passed · 1 warning · 1 failure

PASS. WARN. FAIL. Clear signals you can act on before the model's suggestion becomes production code.
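For a sense of how one of these checks might work, here is a small sketch of a dependency check that produces the same three levels. Everything in it is hypothetical: the `Finding` type, the `check_dependencies` and `exit_code` helpers, and especially the idea that strict mode turns warnings into failures, which is my assumption for this example rather than a description of what `mneme check --mode strict` does.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    level: str    # "PASS", "WARN", or "FAIL"
    message: str


def check_dependencies(declared: set[str], approved: set[str], banned: set[str]) -> list[Finding]:
    """Hypothetical check: banned packages fail, unapproved ones warn."""
    findings = []
    for name in sorted(declared):
        if name in banned:
            findings.append(Finding("FAIL", f"{name} was explicitly ruled out"))
        elif name not in approved:
            findings.append(Finding("WARN", f"{name} not in approved list"))
    if not findings:
        findings.append(Finding("PASS", "All dependencies approved"))
    return findings


def exit_code(findings: list[Finding], strict: bool) -> int:
    """Return 1 on any FAIL; in strict mode (assumed) a WARN also returns 1."""
    blocking = {"FAIL", "WARN"} if strict else {"FAIL"}
    return 1 if any(f.level in blocking for f in findings) else 0


findings = check_dependencies(
    declared={"pg", "express", "prisma"},
    approved={"pg", "express"},
    banned={"mongoose"},
)
for f in findings:
    print(f"{f.level:<5} {f.message}")
print("exit code (strict):", exit_code(findings, strict=True))
```

Returning an exit code is what would let a check like this block a CI job or a pre-commit hook instead of only printing a report.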

Cursor and Claude Code Integration

Because most AI-assisted development happens inside editors, Mneme HQ can generate Cursor rules directly from your stored decisions:

$ mneme cursor generate
Generated .cursor/rules/mneme.mdc from 7 decisions

Those rules become editor-level guardrails. Every suggestion Cursor or Claude Code makes inside your project gets filtered through the constraints you've already established. You stop explaining. The model already knows.
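The generation step can be pictured as a simple rendering function. The sketch below turns a list of decisions into the text of a rules file. The `render_cursor_rule` function is hypothetical, and the frontmatter keys (`description`, `alwaysApply`) follow Cursor's project rules format as I understand it; the file Mneme HQ actually writes may be laid out differently.

```python
def render_cursor_rule(decisions: list[tuple[str, str]]) -> str:
    """Render (id, text) pairs as the contents of a Cursor project rule file."""
    front_matter = [
        "---",
        "description: Architectural decisions recorded for this project",
        "alwaysApply: true",
        "---",
    ]
    body = ["# Project decisions", ""]
    body += [f"- {decision_id}: {text}" for decision_id, text in decisions]
    return "\n".join(front_matter + [""] + body) + "\n"


rule_text = render_cursor_rule([
    ("ADR-001", "Use Postgres; no new databases without ADR"),
    ("ADR-004", "Data access goes through the repository layer"),
])
print(rule_text)
```

Because the output is plain text derived from the stored decisions, it can be regenerated whenever a decision changes rather than edited by hand.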

Why Now

AI coding agents are getting better at an extraordinary rate. But the gap between what they can do in a single session and what they can maintain across a long-running project is growing, not shrinking. Better models don't fix the memory problem. Longer context windows help at the margins but don't solve continuity.

The projects that will get the most out of AI-assisted development aren't the ones with the best prompts. They're the ones with the best decision infrastructure — the ones where the codebase itself carries the memory of its own architectural intent.

That's what Mneme HQ is for. Open source, repo-native, and CI-ready. If you're using Cursor, Claude Code, or any other AI coding tool on a project that has history — decisions made, patterns locked in, constraints established — I think you'll find it useful.

Try it: mnemehq.com · GitHub

I write about AI governance, architectural drift, and the systems that make AI-assisted development sustainable at scale.
