I Built a 100k-Document RAG System by Hand. Hermes Read the Architecture in 47 Seconds.

This is a submission for the Hermes Agent Challenge.

For the past six months, I have been building what Hermes does by hand.

Not as a thought experiment. Actually building it: a hybrid BM25 + vector search engine on Cloudflare Workers, a Gemma 4 MoE reflection layer, 100,000+ documents indexed, an MCP server with Durable Objects, multimodal image ingestion with Llama 4 Scout. I won the OpenClaw Challenge writing about spec-writing agents. I built bookmark-cli — a personal knowledge engine with 45,053 tweets indexed from years of reading.

When the Hermes Agent Challenge dropped, I had one question: does the tool do what I taught myself to do?

This is that experiment.

The Setup

My machine: Windows 11, Python 3.14, Git Bash (MSYS2), no WSL2.

The repo: vectorize-mcp-worker — my production hybrid RAG system. Six months of architectural decisions baked into TypeScript. V4 routing. Six specialized query routes. Cost analytics down to the millisecond. The kind of codebase where you know exactly what a competent agent should say about it.

The goal: install Hermes, configure it to use Claude, have it summarise the architecture in 5 bullets, and document every friction point.

Friction Point 1: The Installer Doesn't Know You Exist

The official install command:

curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

I ran it. The script loaded, detected my OS, and exited immediately:

Windows detected. Please use the PowerShell installer:
  irm https://raw.githubusercontent.com/.../install.ps1 | iex

That's not in the README. The PowerShell installer exists — it's right there in the repo — but nothing in the public docs points Windows users to it. You find out by reading the bash script's source.

The install UX assumes you're on Linux or macOS. Windows is supported, but you're expected to find your own way there.

Friction Point 2: WSL2 Would Have Saved Me This

I don't have WSL2 installed. That's on me. Every agent tooling project I've touched in the last year has quietly assumed it. Hermes is no different.

If you're on Windows and planning to work with CLI agents: install WSL2 first. Don't assume Git Bash is equivalent. It almost never is.

Friction Point 3: PyPI to the Rescue (Unofficially)

Rather than chase the PowerShell installer, I checked PyPI:

pip index versions hermes-agent
# Available versions: 0.13.0

hermes-agent 0.13.0 exists on PyPI. The official flow never mentions this. I installed it directly:

pip install hermes-agent

Four minutes, some progress bars, done.

Friction Point 4: Dependency Hell

The install succeeded but left me with this:

browser-use 0.12.6 requires openai==2.16.0, but you have openai 2.24.0
browser-use 0.12.6 requires requests==2.32.5, but you have requests 2.33.0
browser-use 0.12.6 requires rich==14.3.1, but you have rich 14.3.3

Hermes bumped openai from 2.16.0 to 2.24.0. Another tool I use (browser-use) pins the older version. Pip didn't resolve the conflict; it kept the newer versions Hermes pulled in and printed a warning. Something will break later. I don't know what yet.

This is the tax you pay for a rich ecosystem. Every major agent framework is in a dependency arms race.
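One way to see which pins are currently violated is to compare them against what is actually installed. The following is a minimal sketch, not part of Hermes or browser-use: find_pin_conflicts and BROWSER_USE_PINS are hypothetical names, and the pins are copied from the warnings above. It only handles exact `==` pins; `pip check` gives a fuller report.

```python
from importlib.metadata import PackageNotFoundError, version


def find_pin_conflicts(pins, installed_version=version):
    """Return {package: (pinned, installed)} for every exact pin that is not met.

    installed_version is injectable so the function can be exercised
    without touching the real environment.
    """
    conflicts = {}
    for name, pinned in pins.items():
        try:
            actual = installed_version(name)
        except PackageNotFoundError:
            actual = None  # the package is not installed at all
        if actual != pinned:
            conflicts[name] = (pinned, actual)
    return conflicts


# Exact pins taken from the browser-use 0.12.6 warnings in this post.
BROWSER_USE_PINS = {"openai": "2.16.0", "requests": "2.32.5", "rich": "14.3.1"}

if __name__ == "__main__":
    for name, (pinned, actual) in find_pin_conflicts(BROWSER_USE_PINS).items():
        print(f"{name}: pinned {pinned}, installed {actual}")
```

Running this before and after installing a new agent tool shows exactly which pins moved, which is easier to act on than a warning buried in an install log.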

Friction Point 5: The Binary Isn't on PATH

After install, hermes doesn't work:

$ hermes doctor
bash: hermes: command not found

The executables land in %APPDATA%\Python\Python314\Scripts, which isn't on PATH by default. Pip warns you — in small print, at the end of a long install log.

Fixed with the full path:

/c/Users/DELL/AppData/Roaming/Python/Python314/Scripts/hermes.exe doctor
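Rather than hardcoding that path, Python itself can report where user-level scripts are installed and whether that directory is on PATH. A minimal sketch, assuming Python 3.10 or newer (for sysconfig.get_preferred_scheme) and a user-level install like the one above; user_scripts_dir and is_on_path are illustrative names, not anything Hermes ships:

```python
import os
import sysconfig
from pathlib import Path


def user_scripts_dir() -> Path:
    """Directory where pip places console scripts for user-level installs."""
    scheme = sysconfig.get_preferred_scheme("user")
    return Path(sysconfig.get_path("scripts", scheme))


def is_on_path(directory, path_value=None) -> bool:
    """True if directory appears as an entry in PATH (or in path_value, if given)."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normcase(os.path.normpath(str(directory)))
    entries = (
        os.path.normcase(os.path.normpath(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    )
    return target in entries


if __name__ == "__main__":
    scripts = user_scripts_dir()
    status = "already on PATH" if is_on_path(scripts) else "NOT on PATH"
    print(f"{scripts} is {status}")
```

If the directory is reported as missing, adding it to PATH once removes the need for the full path on every call.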

`hermes doctor` — Where It Gets Good

Once you find the binary, the experience shifts. Here is what hermes doctor printed:

◆ Python Environment
  ✓ Python 3.14.4
  ⚠ Not in virtual environment (recommended)

◆ Required Packages
  ✓ OpenAI SDK
  ✓ Rich (terminal UI)
  ✓ python-dotenv

◆ Tool Availability
  ✓ browser
  ✓ terminal
  ✓ file
  ✓ memory
  ⚠ browser-cdp (system dependency not met)
  ⚠ web (missing EXA_API_KEY, TAVILY_API_KEY...)

Every warning includes the fix. ✓ or ⚠, then the exact command. You don't go looking for the next step. After an install that gave you nothing, this lands differently.

Friction Point 6: `hermes model` Is Interactive-Only

I wanted to configure Claude from a script:

hermes model --provider anthropic --model claude-sonnet-4-5

No such flags. hermes model opens an interactive TUI. I found no way to set a default model non-interactively from the CLI.

I wrote the API key directly to ~/.hermes/.env:

ANTHROPIC_API_KEY=sk-ant-...

Then passed the model inline at runtime:

hermes chat -m "anthropic/claude-sonnet-4-5" -q "..."
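For a scripted setup, writing the key into the env file can be automated. A minimal sketch, assuming the file is a plain KEY=value list as shown above; upsert_env_var is a hypothetical helper, not a Hermes API, and it does not handle quoting or comments:

```python
from pathlib import Path


def upsert_env_var(env_file: Path, key: str, value: str) -> None:
    """Set key=value in a dotenv-style file, replacing any existing entry for key."""
    if env_file.exists():
        lines = env_file.read_text(encoding="utf-8").splitlines()
    else:
        lines = []
    # Drop any previous assignment of this key, keep everything else untouched.
    kept = [line for line in lines if not line.strip().startswith(f"{key}=")]
    kept.append(f"{key}={value}")
    env_file.parent.mkdir(parents=True, exist_ok=True)
    env_file.write_text("\n".join(kept) + "\n", encoding="utf-8")
```

Calling upsert_env_var(Path.home() / ".hermes" / ".env", "ANTHROPIC_API_KEY", key) with the key read from a secret store or an environment variable keeps the secret out of the script itself.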

Friction Point 7: `--source` Doesn't Set the Working Directory

I tried:

hermes chat --source "C:/path/to/vectorize-mcp-worker/src" -q "Summarise..."

Hermes ignored the path and started from my home directory, listing unrelated projects:

I can see:
- bookmark-cli
- dannwaneri
- siteclinic

Which project should I analyze?

--source doesn't mean "work in this directory." I had to cd into the repo first. Small thing, confusing when you're scripting.
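When scripting, the cd can be made explicit by setting the working directory of the child process. A minimal sketch, using only the -m and -q flags shown in this post; build_chat_command and ask_repo are hypothetical wrapper names, and hermes_bin assumes the executable is on PATH (pass the full path otherwise):

```python
import subprocess
from pathlib import Path


def build_chat_command(model: str, query: str, hermes_bin: str = "hermes") -> list[str]:
    """Assemble the argument list for a one-shot hermes chat query."""
    return [hermes_bin, "chat", "-m", model, "-q", query]


def ask_repo(repo_dir, model: str, query: str, hermes_bin: str = "hermes") -> str:
    """Run the query with repo_dir as the working directory and return stdout."""
    completed = subprocess.run(
        build_chat_command(model, query, hermes_bin),
        cwd=Path(repo_dir),   # the equivalent of `cd repo_dir` before running
        capture_output=True,
        text=True,
        check=True,           # raise CalledProcessError on a non-zero exit code
    )
    return completed.stdout
```

Passing cwd keeps the script independent of wherever it happens to be launched from.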

The Actual Session

Running from the repo directory with Claude as the model:

cd vectorize-mcp-worker
hermes chat \
  -m "anthropic/claude-sonnet-4-5" \
  -q "Summarise the architecture of this codebase in exactly 5 bullet points." \
  -Q

Output (verbatim, 47 seconds):

Cloudflare Workers-based RAG Platform — Serverless hybrid RAG system deployed on Cloudflare Workers with Vectorize (384d/1024d vector store), D1 (SQLite for full-text and metadata), Workers AI (embeddings via bge-small/bge-m3/qwen3-0.6b), service bindings to multimodal-pro-worker for image processing, and MCP Durable Objects for stateful agent sessions.

Intelligent V4 Routing Architecture — Query router (RouteSelector) uses Llama 3.2 3B intent classifier to dispatch to 6 specialized routes: ENTITY_LOOKUP (SQL), SEMANTIC_SEARCH (vector), KEYWORD_EXACT (BM25), OCR_DOCUMENT (text extraction), VISUAL_ANALYSIS (Llama 4 Scout vision), and GRAPH_REASONING (relationship traversal), with automatic fallback to vector search on route failure.

Hybrid Search with RRF Fusion — Dual-path retrieval combining vector search (Vectorize query) and BM25 keyword search (D1 full-text index), merged via Reciprocal Rank Fusion (RRF) with k=60, followed by optional cross-encoder reranking (bge-reranker-base), semantic highlighting, and Cloudflare Cache API for cacheable queries.

Chunking and Multi-Tenancy Pipeline — Ingestion engine performs recursive chunking with 15% overlap, parallel batch embedding (up to 100 docs/batch), deduplication via parent_id, and tenant isolation enforced by middleware that injects tenant_id filters into all search/delete operations.

Model Context Protocol Server — Exposes tools via modern MCP Streamable HTTP transport backed by VectorizeMcpAgent Durable Object at /mcp endpoint, with legacy JSON-RPC endpoints, cost analytics tracking (totalMs/embeddingMs/vectorMs/rerankerMs), rate limiting on mutating endpoints, and reflection system for synthetic training data generation.

I read that three times.

It got every major component right. RRF with k=60 — correct. Six routing modes, named accurately. The Durable Objects backing the MCP server. I've had reviewers describe the search as "just Vectorize" — missing the BM25 path entirely. Hermes got all of it in 47 seconds.

It mentioned the reflection system but never named the Gemma 4 MoE model behind it, and it listed the vector dimensions and the embedding models without mapping one to the other the way I would. Details inside details. The architecture read is accurate.
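For readers who haven't met Reciprocal Rank Fusion: each document scores 1 / (k + rank) in every ranked list it appears in, and the scores are summed across lists. The following is a generic Python sketch of that formula with k=60, not the TypeScript from vectorize-mcp-worker; the function name and the sample document IDs are made up for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list, best first.

    rankings: iterable of lists, each ordered from most to least relevant.
    k: smoothing constant; larger values flatten the gap between ranks.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score; break ties by ID so the output is deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    vector_hits = ["a", "b", "c"]   # hypothetical vector-search ranking
    keyword_hits = ["b", "d", "a"]  # hypothetical BM25 ranking
    print(reciprocal_rank_fusion([vector_hits, keyword_hits]))
```

Because only ranks are used, the vector and BM25 scores never need to be normalised against each other, which is the usual reason to pick RRF for hybrid search.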

What I Actually Think

Hermes got the architecture right. Not close enough — right. And that's the thing that mattered most to me, because it's the hardest thing to fake. You can summarise README bullet points. You can't summarise a codebase you didn't actually read.

The install is rough if you're on Windows. Not broken — rough. The PyPI path works, the dependency conflicts are manageable, but "check the bash script source to find the PowerShell installer" is not an onboarding flow.

The gap I expected — between six months of building this by hand and a first session with a tool someone else built — was smaller than I thought it would be. That means either Hermes is further along than the install suggests, or I built something closer to commodity than I wanted to admit.

Probably both.

Multi-step sessions are what I actually want to test next — whether Hermes holds architectural context across a conversation the way I hold it across a codebase. That's the harder question. This was one query.

The Honest Take

If you're a Windows developer coming to Hermes cold: budget 30 minutes to fight the install, not 5. Once you're past it, the tool is real.

If you're a builder who's been assembling RAG pipelines by hand: Hermes won't replace what you know. It uses what you know. The architectural understanding you built making those systems is exactly what makes you dangerous with a tool like this.

The difference is now I can ask the codebase questions instead of just reading it.

That's not nothing.

All commands run on Windows 11, Python 3.14, Git Bash. Hermes Agent v0.13.0. Claude Sonnet 4.5 via Anthropic API.

vectorize-mcp-worker: hybrid RAG on Cloudflare Workers — github.com/dannwaneri/vectorize-mcp-worker
