Hermes Memory Providers: A Complete Breakdown for New Users

Hermes has a lot of memory options. If you're new, the choices can be overwhelming — built-in memory, 8 external providers, different costs, different architectures. This guide breaks it all down so you can make the right call for your setup.

First: Built-In Memory (Always Active)

Before we talk providers, understand that built-in memory is always on. It doesn't cost anything, requires no setup, and works out of the box.

It lives in two files in ~/.hermes/memories/:

File	Purpose	Char Limit
MEMORY.md	Agent's notes — environment facts, project conventions, lessons learned	2,200 chars (~800 tokens)
USER.md	User profile — your name, preferences, communication style	1,375 chars (~500 tokens)

Both are injected into the system prompt at the start of every session. The agent manages them automatically — it saves preferences you correct, environment facts it discovers, and conventions it learns.

Key details:

Entries are separated by § delimiters
The header shows usage % (e.g. MEMORY [67% — 1,474/2,200 chars])
Above 80% capacity, the agent should consolidate before adding
Duplicate entries are auto-rejected
Entries are scanned for injection/exfiltration patterns for security
Changes persist to disk immediately but only appear in the system prompt at the start of the next session. The prompt uses a frozen snapshot for the whole session, which preserves the LLM prefix cache.
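To make the bookkeeping above concrete, here is a minimal Python sketch of how entries, the usage header, and the 80% rule could fit together. It is not Hermes's actual code: the function names, the exact-match duplicate check, and the way characters are counted (entries joined by a delimiter line) are all assumptions for illustration.

```python
DELIM = "§"
LIMITS = {"MEMORY": 2200, "USER": 1375}  # char limits from the table above
CONSOLIDATE_AT = 0.80                    # "above 80%, consolidate before adding"


def split_entries(text):
    """Split a memory file's text into stripped, non-empty entries."""
    return [part.strip() for part in text.split(DELIM) if part.strip()]


def used_chars(entries):
    """Assumption: usage counts the entries joined by a delimiter line."""
    return len(f"\n{DELIM}\n".join(entries))


def usage_header(name, used, limit):
    """Format a header like the example above."""
    pct = round(100 * used / limit)
    # \u2014 is the dash character used in the example header.
    return f"{name} [{pct}% \u2014 {used:,}/{limit:,} chars]"


def try_add(entries, new_entry, name="MEMORY"):
    """Return (entries, status) without mutating the input list."""
    limit = LIMITS[name]
    new_entry = new_entry.strip()
    if new_entry in entries:                      # hypothetical exact-match check
        return entries, "duplicate"
    if used_chars(entries) / limit > CONSOLIDATE_AT:
        return entries, "consolidate_first"
    candidate = entries + [new_entry]
    if used_chars(candidate) > limit:
        return entries, "over_limit"
    return candidate, "added"
```

Calling usage_header("MEMORY", 1474, 2200) reproduces the example header from the list, and try_add refuses new entries once the file is past 80% full, which is the point where the agent is expected to consolidate.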

For most new users, built-in memory is enough. It handles preferences, project facts, and daily workflow notes. You don't need an external provider for a personal assistant setup.

But you'll want one when:

You have multiple Hermes profiles that should share knowledge
You want the agent to learn and synthesize across sessions automatically
You're running long conversations that exceed context limits
You need structured knowledge retrieval (entities, relationships, not just text blobs)

The 8 External Memory Providers

All external providers are set up and managed with the same commands:

hermes memory setup      # interactive picker
hermes memory status     # check what's active
hermes memory off        # disable

Or set manually in ~/.hermes/config.yaml:

memory:
  provider: hindsight    # or any of the 8

Important: Only one external provider can be active at a time. All of them layer on top of built-in memory — they don't replace it.

Quick Comparison

Provider	Storage	Cost	Unique Angle	Best For
Hindsight	Local/Cloud	Free (local)	Knowledge graph + reflect synthesis	Highest accuracy, privacy
Holographic	Local SQLite	Free	HRR algebra + trust scoring, zero deps	Air-gapped, zero-install
OpenViking	Self-hosted	Free (AGPL)	Tiered L0/L1/L2 loading, 80-90% token savings	Self-hosted teams, cost optimization
Mem0	Cloud	Freemium	Server-side LLM extraction, dual memory scope	Fastest setup
Honcho	Cloud/Self	Paid (cloud) / Free (self-hosted)	Dialectic user modeling	Multi-agent, deep user understanding
ByteRover	Local/Cloud	Freemium	Knowledge tree in human-readable Markdown	Pre-compression knowledge capture
RetainDB	Cloud	Paid	Hybrid search: vector + BM25 + reranking	Production search quality
SuperMemory	Cloud	See supermemory.ai pricing	Web-focused memory with browser integration	Web research workflows

Benchmark Snapshot

Only two providers have published LongMemEval scores:

Provider	Score	Model
Hindsight	91.4%	Gemini-3
Hindsight	89.0%	Open-source 120B
Mem0	67.6%	GPT-4o (LongMemEval-S variant)

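Hindsight has the highest published retrieval accuracy. Keep in mind that the Mem0 score was measured with a different model on the LongMemEval-S variant, so the numbers aren't a strict like-for-like comparison. The other providers haven't published comparable benchmarks.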

Provider Deep Dives

🥇 Hindsight

The best all-around choice for most users who want local storage and high accuracy.

Stores structured knowledge — discrete facts, named entities, and relationships — not raw text chunks. Its unique hindsight_reflect tool periodically synthesizes higher-level insights across all memories. Think of it as the agent building a personal knowledge graph over time.
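If "structured knowledge, not raw text chunks" sounds abstract, this small Python sketch shows the general idea of storing facts as subject/relation/object triples and recalling them by entity. It is only an illustration: the Fact and FactStore names and the schema are hypothetical, the method names just echo the hindsight_retain and hindsight_recall tool names, and none of this is Hindsight's real API or storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Fact:
    subject: str
    relation: str
    obj: str


class FactStore:
    """Toy in-memory triple store (hypothetical, for illustration only)."""

    def __init__(self):
        self.facts = set()

    def retain(self, subject, relation, obj):
        # A set makes re-adding the same fact a no-op.
        self.facts.add(Fact(subject, relation, obj))

    def recall(self, entity):
        """Return every fact that mentions the entity, in a stable order."""
        hits = [f for f in self.facts if entity in (f.subject, f.obj)]
        return sorted(hits, key=lambda f: (f.subject, f.relation, f.obj))


store = FactStore()
store.retain("api-service", "written_in", "Go")
store.retain("api-service", "deployed_on", "Fly.io")
store.retain("alice", "maintains", "api-service")

for fact in store.recall("api-service"):
    print(fact.subject, fact.relation, fact.obj)
```

Because each fact names its entities explicitly, a question about "api-service" pulls back all three related facts, including the one where it is the object rather than the subject. A plain text-chunk store would have to hope the right chunk happens to match.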

Setup:  hermes memory setup → select Hindsight
        Leave blank for local daemon, or set HINDSIGHT_API_KEY for cloud
Tools:  hindsight_recall, hindsight_retain, hindsight_reflect
Cost:   Free (local PostgreSQL daemon) / Cloud available for teams

Best if: You want the highest retrieval accuracy, need structured knowledge, or handle privacy-sensitive data.

Holographic

Zero dependencies. Nothing leaves your machine. Literally two tools and done.

Uses Holographic Reduced Representations (HRR) — memories stored as superposed complex-valued vectors. Recall is algebraic, not similarity-based. A trust-scoring mechanism causes confirmed memories to gain weight and contradicted ones to decay over time.
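For the curious, here is a rough Python sketch of how HRR-style storage works in general, using unit-magnitude complex vectors. It is a textbook-style illustration under my own assumptions (vector size, role and filler names, the cleanup step), not the Holographic provider's actual implementation, and it leaves out trust scoring entirely.

```python
import cmath
import math
import random

DIM = 512                 # assumed vector size, chosen for the demo
rng = random.Random(0)    # seeded so the demo is repeatable


def random_vector():
    """Random unit-magnitude complex vector (one random phase per dimension)."""
    return [cmath.exp(1j * rng.uniform(-math.pi, math.pi)) for _ in range(DIM)]


def bind(a, b):
    """Bind a role to a filler by element-wise complex multiplication."""
    return [x * y for x, y in zip(a, b)]


def unbind(trace, role):
    """Undo a binding by multiplying with the role's complex conjugate."""
    return [t * r.conjugate() for t, r in zip(trace, role)]


def superpose(*vectors):
    """Store several bindings in one vector by adding them together."""
    return [sum(column) for column in zip(*vectors)]


def similarity(a, b):
    """Normalized real inner product, close to 1.0 for matching vectors."""
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / DIM


def recall(trace, role, vocabulary):
    """Unbind algebraically, then clean up against the known fillers."""
    noisy = unbind(trace, role)
    return max(vocabulary, key=lambda name: similarity(noisy, vocabulary[name]))


roles = {name: random_vector() for name in ("editor", "shell", "language")}
fillers = {name: random_vector() for name in ("vim", "zsh", "python")}
trace = superpose(
    bind(roles["editor"], fillers["vim"]),
    bind(roles["shell"], fillers["zsh"]),
    bind(roles["language"], fillers["python"]),
)
print(recall(trace, roles["editor"], fillers))
```

All three role/filler pairs live in the single trace vector. Multiplying by the conjugate of the "editor" role recovers a noisy copy of its filler, and the final comparison against the known fillers just snaps that noisy result to the nearest clean one.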

Setup:  hermes memory setup → select Holographic. That's it. No API keys.
Tools:  2 tools (minimal by design)
Cost:   Free. Local SQLite. Period.

Best if: You're in an air-gapped environment, hate external dependencies, or want self-correcting memory that learns which memories to trust.

OpenViking

The token-saver. Self-hosted context database from ByteDance.

Its filesystem-style hierarchy with tiered loading is the standout feature:

L0 (Abstract): ~100 tokens — loaded every turn
L1 (Overview): ~2k tokens — loaded when planning
L2 (Full): Complete content — loaded only when deep context needed

This means 80-90% token cost reduction vs. loading full context every turn. Auto-extracts memories into 6 categories: profile, preferences, entities, events, cases, patterns.
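Where could a figure like 80-90% come from? The back-of-the-envelope Python sketch below estimates average tokens per turn under tiered loading. The tier sizes come from the list above, but everything else is an assumption for illustration: the 20,000-token full context, the share of turns that need L1 or L2, and the idea that tiers stack on top of L0.

```python
L0_TOKENS = 100     # abstract, loaded every turn (from the list above)
L1_TOKENS = 2_000   # overview, loaded when planning (from the list above)


def average_tokens_per_turn(full_tokens, l1_share, l2_share):
    """Expected tokens per turn if L1/L2 load only on a share of turns.

    Assumption: L0 is always loaded, and L1 or L2 are added on top of it.
    """
    return L0_TOKENS + l1_share * L1_TOKENS + l2_share * full_tokens


def savings(full_tokens, l1_share, l2_share):
    """Fraction of tokens saved versus loading the full context every turn."""
    tiered = average_tokens_per_turn(full_tokens, l1_share, l2_share)
    return 1 - tiered / full_tokens


# Hypothetical workload: 20k-token full context, 30% of turns need the
# overview, 10% of turns need the full content.
print(f"{savings(20_000, l1_share=0.3, l2_share=0.1):.1%}")
```

With those made-up numbers the average drops from 20,000 to roughly 2,700 tokens per turn, which lands inside the quoted range. Your own savings depend heavily on how often the agent really needs L2.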

Setup:  pip install openviking
        openviking-server
        hermes memory setup → select OpenViking
        Set OPENVIKING_ENDPOINT=http://localhost:1933
Tools:  viking_search, viking_read, viking_browse, viking_remember, viking_add_resource
Cost:   Free (AGPL-3.0, self-hosted)

Best if: You're running at scale, want self-hosted infrastructure, or need to minimize token costs.

Mem0

The "just make it work" option. 30 seconds to running.

Server-side LLM extraction means Mem0's infrastructure decides what to keep. Includes a circuit breaker so memory failures don't block agent responses. Dual memory scope (session + user) means it separates short-term context from long-term facts.
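The circuit breaker idea is worth a quick illustration. Below is a minimal Python sketch of the general pattern, not Mem0's or Hermes's actual code: the class, its parameters (max_failures, reset_after), and the fallback behavior are assumptions chosen to show how a failing memory backend can be skipped instead of stalling every reply.

```python
import time


class CircuitBreaker:
    """Skip a failing dependency for a while instead of retrying every call."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_after = reset_after    # seconds to stay open
        self.clock = clock                # injectable for testing
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, fn, *args, fallback=None, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                return fallback           # open: don't even try the backend
            # Half-open: allow one trial call; one more failure reopens.
            self.opened_at = None
            self.failures = self.max_failures - 1
        try:
            result = fn(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()
            return fallback               # the agent continues without memory
        self.failures = 0
        return result
```

A caller would wrap each memory lookup, for example breaker.call(search_memory, query, fallback=[]), where search_memory is a hypothetical function that talks to the provider. After a few consecutive failures the breaker stops calling it for a while and the agent simply answers without recalled memories.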

Setup:  hermes memory setup → select Mem0
        Set MEM0_API_KEY=your-key
Tools:  mem0_add, mem0_search, mem0_get_all
Cost:   Freemium (free tier available)

Best if: You want the fastest setup, don't want to self-host, and are okay with cloud storage. Good starting point — you can always migrate later.

Honcho

The philosopher. Builds a model of how you think, not just what you know.

Dialectic user modeling captures reasoning patterns, communication style, and decision-making tendencies over time. Two-layer context injection with configurable cadences for refreshes. Supports multi-agent setups with separate AI peers per Hermes profile.

Setup:  hermes memory setup → select Honcho
        Set HONCHO_API_KEY=your-key
Tools:  honcho_profile, honcho_search, honcho_context, honcho_reasoning, honcho_conclude
Cost:   Paid (cloud) / Free (self-hosted, AGPL-3.0)

⚠️ Licensing note: The open-source version is AGPL-3.0. If you modify it and let users interact with it over a network, the AGPL requires you to offer those users your modified source. Using the managed cloud avoids this.

Best if: You're building a personal assistant that should deepen its model of you over time, or running multi-agent systems with shared user context.

ByteRover

Your knowledge, stored as readable Markdown. No black boxes.

Hierarchical knowledge tree stored in .brv/context-tree/ as human-readable Markdown files. Unique pre-compression extraction hook fires before Hermes compresses long conversations, capturing knowledge before context gets summarized away.

Setup:  hermes memory setup → select ByteRover
Tools:  byterover_search, byterover_list, byterover_forget
Cost:   Freemium

Best if: You want full visibility into stored memory, or need to capture knowledge from long conversations before compression loses it.

RetainDB

Search nerd's pick. Hybrid vector + BM25 + reranking.

Combines multiple retrieval strategies for the highest-quality search results. Vector similarity catches semantic matches, BM25 catches exact keyword matches, and reranking puts the best results on top.
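To show how two result lists can be merged before reranking, here is a small Python sketch using reciprocal rank fusion (RRF). I don't know how RetainDB actually combines its vector and BM25 results, so treat RRF as a stand-in: it is a common fusion technique, and the memory IDs and the k=60 constant are illustrative assumptions.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of IDs into one list, best first.

    Each list contributes 1 / (k + rank) per item, so items ranked well
    by more than one retriever rise to the top.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score descending; break ties by ID for a stable result.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["m3", "m1", "m2"]   # hypothetical semantic matches
bm25_hits = ["m1", "m4", "m3"]     # hypothetical keyword matches

fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
print(fused)  # ['m1', 'm3', 'm4', 'm2']
```

Here m1 and m3 win because both retrievers found them. In a full pipeline the fused list would then go to a reranking model for the final ordering.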

Setup:  hermes memory setup → select RetainDB
Tools:  retaindb_search, retaindb_store
Cost:   Paid

Best if: Retrieval quality is your top priority and you're willing to pay for it.

SuperMemory

Web research workflows. Browser-integrated memory.

Designed for memory that extends into the browser — captures and retrieves web content as part of your knowledge base.

Setup:  hermes memory setup → select SuperMemory
Cost:   See supermemory.ai pricing

Best if: Your workflow involves heavy web research and you want persistent memory of online content.

Cost Summary

Tier	Providers	Notes
Free, local	Holographic, Hindsight (local), OpenViking	No API keys, no cloud. Holographic is the easiest pick.
Free tier / freemium	Mem0, ByteRover	Start free, pay for higher limits
Paid cloud	Honcho, RetainDB, SuperMemory	Production features, team support
Always free (built-in)	MEMORY.md + USER.md	No setup, always active, 2,200 + 1,375 char limits

My Recommendations

Just getting started?
Stick with built-in memory. It covers most everyday use cases. Add an external provider only when you hit its limits.

Want the best free local experience?
Hindsight (local daemon). Best benchmarks, nothing leaves your machine, structured knowledge graph.

Want zero config?
Holographic. Pick it in hermes memory setup and you're done. No API keys, no servers.

Want the easiest cloud setup?
Mem0. 30 seconds, free tier, hands-off extraction.

Running multi-agent or want deep user modeling?
Honcho. The dialectic reasoning is genuinely different from every other provider.

Care about token costs at scale?
OpenViking's tiered loading will save you 80-90% on tokens.

Migrating Between Providers

Switching is straightforward:

hermes memory setup      # pick new provider
hermes memory status     # confirm it's active

Your built-in memory (MEMORY.md, USER.md) stays intact regardless of which external provider you use. Note that external providers store data in their own backends — switching providers means starting fresh with the new one's knowledge base. There's no automated migration between providers yet.

Questions?

Drop them in the comments. I'm happy to help you pick the right setup for your use case.
