Stop telling your RAG bot not to hallucinate. Make it impossible.

The suggestion every RAG app ignores

If you've shipped a retrieval-augmented assistant, you've written some version of this line in your system prompt:

"If the answer isn't in the provided context, say you don't know. Do not make things up."

And you've watched the model cheerfully ignore it under pressure. A confident-sounding question comes in, retrieval returns something adjacent, and the model stitches together an answer that's plausible, fluent, and wrong. Telling a language model not to hallucinate is a suggestion — and suggestions lose to the model's overwhelming prior toward being helpful.

I got tired of fighting this with prompt wording, so I tried a different framing while building MCP SDK Docs Assistant, an assistant for the Model Context Protocol TypeScript SDK. The framing: don't ask the model to refuse — remove its ability to fabricate.

Refusal as code, not as a prompt

The core idea is that the model can only hallucinate if you hand it material to hallucinate from. So the refusal decision lives in the retrieval tool, before the model ever sees anything. If nothing clears a confidence bar, the tool returns an empty result set, and the model is left with no source text to spin into an answer.

In practice, the tool looks roughly like this:

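// Tool handler sketch; the function name and parameter types are illustrative.
async function searchDocs(query: string, version: string) {
  const candidates = await hybridSearch(query, { version, limit: 12 });

  // Refusal gate: a best cosine similarity below 0.45 means no confident match.
  if (!hasConfidentMatch(candidates)) {
    return { relevant: false, results: [] };  // model has nothing to work with
  }

  // Rerank the survivors; the 6 reads as the size of the final set.
  const results = await rerank(query, candidates, 6);
  return { relevant: true, results };
}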

The model isn't asked to behave. The system is shaped so that the only coherent next move, when results come back empty, is to say "the docs don't cover this." Refusal stops being a personality trait you're hoping for and becomes a property of the architecture.
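For illustration, here is one way the gate check itself could look. This is a minimal sketch, not the project's actual code: the `Candidate` shape and the `similarity` field name are assumptions, and only the 0.45 threshold comes from the snippet above.

```typescript
// Hypothetical shape for a retrieved chunk; the field names are assumptions.
interface Candidate {
  id: string;
  text: string;
  similarity: number; // cosine similarity to the query; higher means closer
}

// Threshold from the snippet above; tune it against your own eval set.
const MIN_SIMILARITY = 0.45;

// True when at least one candidate clears the similarity bar.
// An empty candidate list never counts as a confident match.
function hasConfidentMatch(
  candidates: Candidate[],
  minSimilarity: number = MIN_SIMILARITY,
): boolean {
  return candidates.some((c) => c.similarity >= minSimilarity);
}
```

Checking the raw vector similarity rather than a fused rank score keeps the bar interpretable: a rank only says a chunk beat its neighbours, not that it is actually close to the question.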

Why this particular SDK needed it

There's a second failure mode this assistant had to solve, and it's specific to fast-moving libraries. The MCP TypeScript SDK recently went through a v1 to v2 transition with breaking API changes — method renames, a transport swap, and a single package splitting into three. Generic documentation bots tend to blend v1 and v2 snippets together, so the code you copy out doesn't run.

The fix is to treat version as a first-class dimension of retrieval. Every chunk in the corpus is tagged with its version at ingestion, retrieval filters by the version in scope, and when both versions are relevant the answer labels each one and keeps the v1 and v2 material separate. Combined with the refusal gate, that gives three disciplines the assistant is built to hold: it stays version-correct, it cites the exact source line behind every claim, and it refuses when it should.
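A rough sketch of what version-as-a-dimension can mean in code. The `DocChunk` shape, the `source` field, and both helper names are assumptions for illustration. In a real pipeline the filter would more likely live in the database query than in application code.

```typescript
type SdkVersion = "v1" | "v2";

// Hypothetical chunk record; the version tag is attached once, at ingestion.
interface DocChunk {
  id: string;
  version: SdkVersion;
  source: string; // citation target, e.g. a file path plus line (assumed format)
  text: string;
}

// Keep only chunks whose version is in scope for this question.
function filterByVersion(chunks: DocChunk[], inScope: SdkVersion[]): DocChunk[] {
  return chunks.filter((c) => inScope.includes(c.version));
}

// Group chunks by version so an answer can present v1 and v2 under separate labels.
function groupByVersion(chunks: DocChunk[]): Record<SdkVersion, DocChunk[]> {
  const groups: Record<SdkVersion, DocChunk[]> = { v1: [], v2: [] };
  for (const c of chunks) {
    groups[c.version].push(c);
  }
  return groups;
}
```

Filtering before ranking matters: a v1 chunk that never enters the candidate list cannot be blended into a v2 answer later.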

The retrieval pipeline, briefly

The path from a question to an answer runs through a small pipeline. A hybrid search fuses semantic similarity (vector cosine over pgvector) with lexical matching (Postgres full-text) using Reciprocal Rank Fusion, so you get the recall of embeddings without losing exact-term precision. The fused candidates pass through the refusal gate above, and survivors go through an LLM cross-encoder rerank to tighten the final set. Only then does the agent generate, citing as it goes.
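To make the fusion step concrete, here is a small sketch of Reciprocal Rank Fusion over two ranked lists of chunk ids. The function name is hypothetical, and `k = 60` is the constant commonly used with RRF, not a value confirmed for this project.

```typescript
// Reciprocal Rank Fusion: each list contributes 1 / (k + rank) per document,
// with rank starting at 1. Documents ranked well in several lists rise to the top.
function reciprocalRankFusion(
  rankings: string[][],
  k: number = 60,
): Array<{ id: string; score: number }> {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) ?? 0) + 1 / (k + rank));
    });
  }
  return Array.from(scores.entries())
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}

// Illustrative ids only: one ranking from vector search, one from full-text search.
const vectorHits = ["a", "b", "c"];
const lexicalHits = ["c", "a", "d"];
const fused = reciprocalRankFusion([vectorHits, lexicalHits]);
// fused order of ids: a, c, b, d
```

Because RRF works on ranks rather than raw scores, it never has to reconcile cosine similarities with full-text relevance scores, which live on different scales.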

The same retrieval core powers three surfaces from one codebase — a web chat, a CLI, and an MCP server — which means the assistant can publish itself as a tool that Cursor or Claude Desktop can call. An MCP tool that teaches the MCP SDK is a fun bit of recursion.

The part where the eval harness earned its keep

Here's the story I think is most useful to anyone building this kind of system, because it's about what happens after you think you're done.

I wrote an eval harness around an 18-case golden set: it runs every question through the live agent and scores the behaviors that actually matter, namely refusal accuracy, whether answers carry citations, and version-correctness. An early run came back at 83%, and the failures were the instructive kind. Two genuinely answerable questions were being false-refused: the assistant said "the docs don't cover this" even though retrieval had surfaced strong hits, with top similarities around 0.70 to 0.72, comfortably above the 0.45 gate.

The retrieval layer had done its job. The problem was that the model was re-judging relevance itself, second-guessing a signal the tool had already cleared. The fix was to make refusal defer to the tool's signal rather than letting the model relitigate it. On the re-run, the suite came back at 100%, with refusal accuracy still intact — meaning the fix didn't buy answer coverage by making the system reckless.

That number deserves a caveat, and it's an important one: 100% is against this repo's own curated golden set. It proves the system holds its contract on a representative set of questions, not that it's omniscient. The real value of the harness isn't the score — it's that expanding the set is how future regressions get caught before users find them.
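A harness like this can be small. The sketch below shows one plausible way to score a golden case on the three behaviors named above. The `EvalCase` and `AgentOutput` shapes and the pass rule are assumptions, not the repo's actual scoring code.

```typescript
// Hypothetical golden-set entry.
interface EvalCase {
  question: string;
  shouldRefuse: boolean;          // true for out-of-scope questions
  expectedVersion?: "v1" | "v2";  // set when the answer must stick to one version
}

// Hypothetical summary of what the agent did for one question.
interface AgentOutput {
  refused: boolean;
  citations: string[];
  versionsUsed: Array<"v1" | "v2">;
}

// A case passes only if the refusal decision, citations, and version all hold.
function casePasses(c: EvalCase, out: AgentOutput): boolean {
  if (c.shouldRefuse) return out.refused;        // out-of-scope: must refuse
  if (out.refused) return false;                 // answerable: a refusal is a false refusal
  if (out.citations.length === 0) return false;  // answers must carry citations
  if (c.expectedVersion !== undefined) {
    const expected = c.expectedVersion;
    return out.versionsUsed.every((v) => v === expected);
  }
  return true;
}

// Fraction of passing cases, in the range 0 to 1.
function passRate(results: boolean[]): number {
  if (results.length === 0) return 0;
  return results.filter((r) => r).length / results.length;
}
```

Scoring false refusals as failures is what surfaces a problem like the one described above. A harness that only checked "did it refuse on out-of-scope questions" would have missed it.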

What I'd take away from this

If you build RAG systems, the transferable lesson isn't "use this threshold" or "use this stack." It's a shift in where you put your guarantees. Behavior you genuinely care about — refusing, citing, staying within a version — is more reliable when it's enforced in code than when it's requested in a prompt. A prompt is a hope. A gate in the retrieval tool is a constraint. And constraints survive contact with adversarial questions in a way that polite instructions never quite do.
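One way to picture that shift: let code, not the model, turn the gate's signal into the refuse-or-answer decision. The sketch below is an assumption about how such a step could look, not a description of how this project implements it. The type names, `planResponse`, and the citation format are all hypothetical.

```typescript
// Hypothetical shapes mirroring the gate's return value.
interface CitedChunk {
  source: string;
  text: string;
}

interface GateResult {
  relevant: boolean;
  results: CitedChunk[];
}

// Either a fixed refusal, or a context block the model must answer from.
type ResponsePlan =
  | { mode: "refuse"; message: string }
  | { mode: "answer"; context: string };

const REFUSAL_MESSAGE = "The docs don't cover this.";

// The gate's signal decides the mode; the model never gets to relitigate relevance.
function planResponse(gate: GateResult): ResponsePlan {
  if (!gate.relevant || gate.results.length === 0) {
    return { mode: "refuse", message: REFUSAL_MESSAGE };
  }
  // Prefix each chunk with its source so the answer can cite it.
  const context = gate.results.map((r) => `[${r.source}] ${r.text}`).join("\n");
  return { mode: "answer", context };
}
```

With this shape, a closed gate never reaches the model at all, and an open gate hands it cited context along with the expectation that it answers from that context.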

It's open source — come build with it

The project is MIT-licensed and I'd love help on it. There's a friendly set of "good first issue" tickets: a dark theme, a copy-to-clipboard button on code blocks, persisting the selected engine, and expanding the eval golden set with more out-of-scope cases (which directly strengthens the refusal guarantees above).

If any of this is the kind of problem you enjoy, the repo, a live demo, and the contributor guide are all linked below. Issues and PRs welcome.

Repo: https://github.com/Kaydenletk/mcp-docs-assistant
Live demo: https://library-assisstant-ai.vercel.app
