Hermes Memory Providers: A Complete Breakdown for New Users
Shane Castil · 2026-05-28 · via DEV Community

Hermes has a lot of memory options. If you're new, the choices can be overwhelming — built-in memory, 8 external providers, different costs, different architectures. This guide breaks it all down so you can make the right call for your setup.


First: Built-In Memory (Always Active)

Before we talk providers, understand that built-in memory is always on. It doesn't cost anything, requires no setup, and works out of the box.

Two files in ~/.hermes/memories/:

| File | Purpose | Char Limit |
| --- | --- | --- |
| MEMORY.md | Agent's notes: environment facts, project conventions, lessons learned | 2,200 chars (~800 tokens) |
| USER.md | User profile: your name, preferences, communication style | 1,375 chars (~500 tokens) |

Both are injected into the system prompt at the start of every session. The agent manages them automatically — it saves preferences you correct, environment facts it discovers, and conventions it learns.

Key details:

  • Entries are separated by § delimiters
  • The header shows usage % (e.g. MEMORY [67% — 1,474/2,200 chars])
  • Above 80% capacity, the agent should consolidate before adding
  • Duplicate entries are auto-rejected
  • Entries are scanned for injection/exfiltration patterns for security
  • Changes persist to disk immediately but appear in the system prompt at the next session (frozen snapshot — preserves LLM prefix cache)
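To make the rules above concrete, here is a minimal sketch of a character-limited memory file. It is not Hermes's implementation: the `MemoryFile` class, the exact on-disk delimiter (`"\n§\n"`), and the way used characters are counted are all assumptions made for illustration. Only the limits, the 80% consolidation threshold, the duplicate rejection, and the header layout come from the description above.

```python
from dataclasses import dataclass, field
from typing import List

# Assumed separator; the article only says entries are separated by "§".
DELIMITER = "\n§\n"


@dataclass
class MemoryFile:
    """Hypothetical model of MEMORY.md / USER.md with a hard character cap."""
    name: str
    char_limit: int
    entries: List[str] = field(default_factory=list)

    def used_chars(self) -> int:
        # Assumption: usage counts the entries plus the delimiters between them.
        return len(DELIMITER.join(self.entries))

    def usage_percent(self) -> int:
        return round(100 * self.used_chars() / self.char_limit)

    def header(self) -> str:
        # Mirrors the header layout quoted above (\u2014 is the dash it uses).
        return (f"{self.name} [{self.usage_percent()}% \u2014 "
                f"{self.used_chars():,}/{self.char_limit:,} chars]")

    def needs_consolidation(self) -> bool:
        # Above 80% capacity the agent should merge entries before adding more.
        return self.used_chars() > 0.8 * self.char_limit

    def add(self, entry: str) -> bool:
        """Return True if stored, False if rejected (empty, duplicate, or over limit)."""
        entry = entry.strip()
        if not entry or entry in self.entries:
            return False
        if len(DELIMITER.join(self.entries + [entry])) > self.char_limit:
            return False
        self.entries.append(entry)
        return True


memory = MemoryFile("MEMORY", char_limit=2_200)
memory.add("Project uses pnpm, not npm")
memory.add("Project uses pnpm, not npm")  # rejected as a duplicate
```

The point of the sketch is the order of checks: reject duplicates first, then enforce the cap, and treat the 80% mark as a signal to consolidate rather than as a hard failure.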

For most new users, built-in memory is enough. It handles preferences, project facts, and daily workflow notes. You don't need an external provider for a personal assistant setup.

But you'll want one when:

  • You have multiple Hermes profiles that should share knowledge
  • You want the agent to learn and synthesize across sessions automatically
  • You're running long conversations that exceed context limits
  • You need structured knowledge retrieval (entities, relationships, not just text blobs)

The 8 External Memory Providers

All external providers are installed via:

hermes memory setup      # interactive picker
hermes memory status     # check what's active
hermes memory off        # disable

Or set manually in ~/.hermes/config.yaml:

memory:
  provider: hindsight    # or any of the 8

Important: Only one external provider can be active at a time. All of them layer on top of built-in memory — they don't replace it.
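The layering rule can be pictured as a small validation step over the parsed config. This is an illustrative sketch, not Hermes code: the `memory_layers` function is hypothetical, and only the `hindsight` identifier appears in the config example above. The other seven identifiers are assumed to be lowercase forms of the provider names.

```python
from typing import List

# Only "hindsight" is confirmed by the config example; the rest are assumptions.
KNOWN_PROVIDERS = {
    "hindsight", "holographic", "openviking", "mem0",
    "honcho", "byterover", "retaindb", "supermemory",
}


def memory_layers(config: dict) -> List[str]:
    """Return the active memory layers for a parsed config.yaml (as a dict).

    Built-in memory is always the first layer; at most one external
    provider is stacked on top of it.
    """
    layers = ["builtin"]
    provider = (config.get("memory") or {}).get("provider")
    if provider is None:
        return layers
    if not isinstance(provider, str):
        raise ValueError("memory.provider must be a single name, not a list")
    name = provider.strip().lower()
    if name not in KNOWN_PROVIDERS:
        raise ValueError(f"unknown memory provider: {provider!r}")
    layers.append(name)
    return layers


layers = memory_layers({"memory": {"provider": "hindsight"}})
```

With no `memory` section at all, the function returns just the built-in layer, which matches the default setup described earlier.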


Quick Comparison

| Provider | Storage | Cost | Unique Angle | Best For |
| --- | --- | --- | --- | --- |
| Hindsight | Local/Cloud | Free (local) | Knowledge graph + reflect synthesis | Highest accuracy, privacy |
| Holographic | Local SQLite | Free | HRR algebra + trust scoring, zero deps | Air-gapped, zero-install |
| OpenViking | Self-hosted | Free (AGPL) | Tiered L0/L1/L2 loading, 80-90% token savings | Self-hosted teams, cost optimization |
| Mem0 | Cloud | Freemium | Server-side LLM extraction, dual memory scope | Fastest setup |
| Honcho | Cloud/Self | Paid (cloud) / Free (self-hosted) | Dialectic user modeling | Multi-agent, deep user understanding |
| ByteRover | Local/Cloud | Freemium | Knowledge tree in human-readable Markdown | Pre-compression knowledge capture |
| RetainDB | Cloud | Paid | Hybrid search: vector + BM25 + reranking | Production search quality |
| SuperMemory | Cloud | Paid (see supermemory.ai pricing) | Web-focused memory with browser integration | Web research workflows |

Benchmark Snapshot

Only two providers have published LongMemEval scores:

| Provider | Score | Model |
| --- | --- | --- |
| Hindsight | 91.4% | Gemini-3 |
| Hindsight | 89.0% | Open-source 120B |
| Mem0 | 67.6% | GPT-4o (LongMemEval-S variant) |

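Among the published scores, Hindsight leads on retrieval accuracy. Keep in mind that the Mem0 number comes from the LongMemEval-S variant with a different model, so the rows are not a strict like-for-like comparison. The other six providers haven't published comparable benchmarks.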


Provider Deep Dives

🥇 Hindsight

The best all-around choice for most users who want local + accurate.

Stores structured knowledge — discrete facts, named entities, and relationships — not raw text chunks. Its unique hindsight_reflect tool periodically synthesizes higher-level insights across all memories. Think of it as the agent building a personal knowledge graph over time.
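To show what "facts, entities, and relationships" buys you over raw text chunks, here is a toy in-memory triple store. It is not Hindsight's data model or API: the `KnowledgeGraph` class and its `retain`/`recall`/`neighbors` methods are hypothetical names chosen to echo the tool names, and the example facts are made up.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Set


class Fact(NamedTuple):
    subject: str
    relation: str
    obj: str


class KnowledgeGraph:
    """Toy store of (subject, relation, object) facts indexed by entity."""

    def __init__(self) -> None:
        self._by_entity: Dict[str, Set[Fact]] = defaultdict(set)

    def retain(self, subject: str, relation: str, obj: str) -> None:
        fact = Fact(subject, relation, obj)
        # Index the fact under both endpoints so either entity can recall it.
        self._by_entity[subject].add(fact)
        self._by_entity[obj].add(fact)

    def recall(self, entity: str) -> List[Fact]:
        """Every fact that mentions the entity, in a stable order."""
        return sorted(self._by_entity.get(entity, ()))

    def neighbors(self, entity: str) -> List[str]:
        """Entities directly connected to the given one."""
        facts = self._by_entity.get(entity, ())
        return sorted({f.obj if f.subject == entity else f.subject for f in facts})


kg = KnowledgeGraph()
kg.retain("alice", "works_on", "billing-service")
kg.retain("billing-service", "deployed_to", "staging")
related = kg.neighbors("billing-service")
```

Because facts are indexed by entity, a question about `billing-service` pulls in both who works on it and where it runs, without relying on the two facts sharing any keywords.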

Setup:  hermes memory setup → select Hindsight
        Leave blank for local daemon, or set HINDSIGHT_API_KEY for cloud
Tools:  hindsight_recall, hindsight_retain, hindsight_reflect
Cost:   Free (local PostgreSQL daemon) / Cloud available for teams

Best if: You want the highest retrieval accuracy, need structured knowledge, or handle privacy-sensitive data.


Holographic

Zero dependencies. Nothing leaves your machine. Literally two tools and done.

Uses Holographic Reduced Representations (HRR) — memories stored as superposed complex-valued vectors. Recall is algebraic, not similarity-based. A trust-scoring mechanism causes confirmed memories to gain weight and contradicted ones to decay over time.
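For readers who haven't met HRR before, the following is a minimal sketch of the idea using the Fourier-domain variant, where each vector is a list of unit-magnitude complex numbers. It is not the Holographic provider's code. The dimensionality, the key and value names, and the final cleanup step (comparing the decoded vector against known candidates) are assumptions for illustration; the cleanup step is a standard part of HRR decoding, while the unbinding itself is purely algebraic.

```python
import cmath
import random

DIM = 512  # vector dimensionality; an illustrative choice


def random_vector(rng):
    """A vector of unit-magnitude complex numbers with random phases."""
    return [cmath.exp(1j * rng.uniform(-cmath.pi, cmath.pi)) for _ in range(DIM)]


def bind(key, value):
    """Bind a key to a value by elementwise complex multiplication."""
    return [k * v for k, v in zip(key, value)]


def unbind(trace, key):
    """Invert binding by multiplying with the key's complex conjugate."""
    return [t * k.conjugate() for t, k in zip(trace, key)]


def superpose(*vectors):
    """Store several bound pairs in one vector by adding them together."""
    return [sum(parts) for parts in zip(*vectors)]


def similarity(a, b):
    """Normalized real inner product: about 1 for a match, near 0 otherwise."""
    return sum((x * y.conjugate()).real for x, y in zip(a, b)) / DIM


rng = random.Random(0)
keys = {name: random_vector(rng) for name in ("editor", "shell", "timezone")}
values = {name: random_vector(rng) for name in ("vim", "zsh", "utc")}

# One superposed trace holds all three key/value pairs at once.
trace = superpose(
    bind(keys["editor"], values["vim"]),
    bind(keys["shell"], values["zsh"]),
    bind(keys["timezone"], values["utc"]),
)


def recall(key_name):
    """Unbind algebraically, then clean up against the known value vectors."""
    noisy = unbind(trace, keys[key_name])
    return max(values, key=lambda name: similarity(noisy, values[name]))


answer = recall("editor")
```

Unbinding with the `editor` key returns the `vim` vector plus noise from the other two pairs, and in high dimensions that noise stays small relative to the match.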

Setup:  hermes memory setup → select Holographic. That's it. No API keys.
Tools:  2 tools (minimal by design)
Cost:   Free. Local SQLite. Period.

Best if: You're in an air-gapped environment, hate external dependencies, or want self-correcting memory that learns what's trustable.
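The self-correcting behaviour can be pictured with a simple trust score per memory. This is a hypothetical sketch, not the provider's actual scoring rule: the `TrustedMemory` class, the starting score, the update rates, and the recall threshold are all invented for illustration.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class TrustedMemory:
    text: str
    trust: float = 0.5  # assumed neutral starting point in [0, 1]

    def confirm(self, rate: float = 0.2) -> None:
        # Move part of the way toward full trust each time the fact is confirmed.
        self.trust += rate * (1.0 - self.trust)

    def contradict(self, decay: float = 0.5) -> None:
        # Shrink trust multiplicatively each time the fact is contradicted.
        self.trust *= 1.0 - decay


def recallable(memories: Iterable[TrustedMemory], threshold: float = 0.25) -> List[str]:
    """Texts of memories at or above the threshold, most trusted first."""
    kept = [m for m in memories if m.trust >= threshold]
    return [m.text for m in sorted(kept, key=lambda m: -m.trust)]


tabs = TrustedMemory("prefers tabs")
tabs.confirm()
stale = TrustedMemory("project uses Python 2")
stale.contradict()
stale.contradict()
visible = recallable([stale, tabs])
```

Repeated contradictions push a memory below the threshold so it stops being recalled, while confirmations move a score toward 1 without ever exceeding it.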


OpenViking

The token-saver. Self-hosted context database from ByteDance.

Its filesystem-style hierarchy with tiered loading is the standout feature:

  • L0 (Abstract): ~100 tokens — loaded every turn
  • L1 (Overview): ~2k tokens — loaded when planning
  • L2 (Full): Complete content — loaded only when deep context needed

This means 80-90% token cost reduction vs. loading full context every turn. Auto-extracts memories into 6 categories: profile, preferences, entities, events, cases, patterns.
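A quick back-of-the-envelope calculation shows where savings in that range can come from. The sketch below is not OpenViking code. The L0 and L1 sizes come from the list above, but the 10,000-token L2 size, the assumption that deeper tiers are loaded in addition to shallower ones, and the mix of turns are made-up inputs.

```python
# L0 and L1 sizes are from the tier list; the L2 size is a made-up example.
TIER_TOKENS = {"L0": 100, "L1": 2_000, "L2": 10_000}
TIER_ORDER = ["L0", "L1", "L2"]


def turn_cost(deepest_tier: str) -> int:
    """Tokens for one turn, assuming each deeper tier is added on top of the shallower ones."""
    depth = TIER_ORDER.index(deepest_tier)
    return sum(TIER_TOKENS[tier] for tier in TIER_ORDER[: depth + 1])


def session_cost(turns) -> int:
    return sum(turn_cost(tier) for tier in turns)


def savings(turns) -> float:
    """Fraction of tokens saved versus loading the full content on every turn."""
    baseline = len(turns) * TIER_TOKENS["L2"]
    return 1 - session_cost(turns) / baseline


# Hypothetical 10-turn session: mostly routine turns, two planning turns, one deep dive.
turns = ["L0"] * 7 + ["L1"] * 2 + ["L2"]
tiered_total = session_cost(turns)
saved = savings(turns)
```

Under these assumptions the session costs 17,000 tokens instead of 100,000, a saving of about 83%. The real figure depends on how large your full documents are and how often a turn needs them.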

Setup:  pip install openviking
        openviking-server
        hermes memory setup → select OpenViking
        Set OPENVIKING_ENDPOINT=http://localhost:1933
Tools:  viking_search, viking_read, viking_browse, viking_remember, viking_add_resource
Cost:   Free (AGPL-3.0, self-hosted)

Best if: You're running at scale, want self-hosted infrastructure, or need to minimize token costs.


Mem0

The "just make it work" option. 30 seconds to running.

Server-side LLM extraction means Mem0's infrastructure decides what to keep. Includes a circuit breaker so memory failures don't block agent responses. Dual memory scope (session + user) means it separates short-term context from long-term facts.
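If the circuit breaker idea is new to you, here is a generic sketch of the pattern. It is not Mem0's or Hermes's implementation: the `CircuitBreaker` class, its thresholds, and the fallback behaviour are assumptions chosen to show how a failing memory backend can be skipped instead of stalling the agent.

```python
import time
from typing import Any, Callable, Optional


class CircuitBreaker:
    """After repeated failures, skip the backend for a cooldown period."""

    def __init__(self, max_failures: int = 3, cooldown: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.cooldown = cooldown
        self._clock = clock
        self._failures = 0
        self._open_until: Optional[float] = None

    @property
    def is_open(self) -> bool:
        return self._open_until is not None and self._clock() < self._open_until

    def call(self, fn: Callable[..., Any], *args: Any, fallback: Any = None) -> Any:
        if self.is_open:
            return fallback  # do not touch the backend while the breaker is open
        try:
            result = fn(*args)
        except Exception:
            self._failures += 1
            if self._failures >= self.max_failures:
                # Trip the breaker: later calls return the fallback immediately.
                self._open_until = self._clock() + self.cooldown
            return fallback
        # A success closes the breaker and clears the failure count.
        self._failures = 0
        self._open_until = None
        return result


def search_memory(query: str) -> list:
    """Stand-in for a remote memory lookup that is currently failing."""
    raise TimeoutError("memory backend unreachable")


breaker = CircuitBreaker(max_failures=2, cooldown=10.0)
first = breaker.call(search_memory, "deploy steps", fallback=[])
second = breaker.call(search_memory, "deploy steps", fallback=[])
```

After the second failure the breaker is open, so the agent keeps answering with an empty memory result until the cooldown passes and a trial call is allowed through.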

Setup:  hermes memory setup → select Mem0
        Set MEM0_API_KEY=your-key
Tools:  mem0_add, mem0_search, mem0_get_all
Cost:   Freemium (free tier available)

Best if: You want the fastest setup, don't want to self-host, and are okay with cloud storage. Good starting point — you can always migrate later.


Honcho

The philosopher. Builds a model of how you think, not just what you know.

Dialectic user modeling captures reasoning patterns, communication style, and decision-making tendencies over time. Two-layer context injection with configurable cadences for refreshes. Supports multi-agent setups with separate AI peers per Hermes profile.

Setup:  hermes memory setup → select Honcho
        Set HONCHO_API_KEY=your-key
Tools:  honcho_profile, honcho_search, honcho_context, honcho_reasoning, honcho_conclude
Cost:   Paid (cloud) / Free (self-hosted, AGPL-3.0)

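⚠️ Licensing note: The open-source version is AGPL v3.0. If you self-host a modified version and let users interact with it over a network, the AGPL requires you to offer those users the corresponding source code. Using the managed cloud avoids this.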

Best if: You're building a personal assistant that should deepen its model of you over time, or running multi-agent systems with shared user context.


ByteRover

Your knowledge, stored as readable Markdown. No black boxes.

Hierarchical knowledge tree stored in .brv/context-tree/ as human-readable Markdown files. Unique pre-compression extraction hook fires before Hermes compresses long conversations, capturing knowledge before context gets summarized away.
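The pre-compression hook is easier to reason about with a small model of the control flow. The sketch below is not ByteRover's or Hermes's code: the `Conversation` class, the `markdown_capture` hook, the `decisions.md` filename, and the rule of keeping lines that start with "Decision:" are hypothetical, and a temporary directory stands in for the real context tree.

```python
import tempfile
from pathlib import Path
from typing import Callable, List

Hook = Callable[[List[str]], None]


class Conversation:
    """Toy conversation buffer that compresses once it grows too long."""

    def __init__(self, max_messages: int, pre_compress_hooks: List[Hook]) -> None:
        self.max_messages = max_messages
        self.hooks = pre_compress_hooks
        self.messages: List[str] = []

    def add(self, message: str) -> None:
        self.messages.append(message)
        if len(self.messages) > self.max_messages:
            # Hooks see the full history before anything is summarized away.
            for hook in self.hooks:
                hook(list(self.messages))
            self._compress()

    def _compress(self) -> None:
        dropped = len(self.messages) - 1
        self.messages = [f"[summary of {dropped} earlier messages]", self.messages[-1]]


def markdown_capture(tree_root: Path) -> Hook:
    """Build a hook that appends decision lines to a Markdown file."""
    def hook(messages: List[str]) -> None:
        notes = [m for m in messages if m.lower().startswith("decision:")]
        if not notes:
            return
        target = tree_root / "decisions.md"  # hypothetical file name
        target.parent.mkdir(parents=True, exist_ok=True)
        with target.open("a", encoding="utf-8") as fh:
            for note in notes:
                fh.write(f"- {note}\n")
    return hook


tree_root = Path(tempfile.mkdtemp()) / "context-tree"
conversation = Conversation(max_messages=3, pre_compress_hooks=[markdown_capture(tree_root)])
for text in ["hi", "Decision: use ULIDs for primary keys", "ok", "next task"]:
    conversation.add(text)
captured = (tree_root / "decisions.md").read_text(encoding="utf-8")
```

The ordering is what matters: the hook runs on the uncompressed history, so the decision survives in a readable file even though the conversation itself now holds only a summary.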

Setup:  hermes memory setup → select ByteRover
Tools:  byterover_search, byterover_list, byterover_forget
Cost:   Freemium

Best if: You want full visibility into stored memory, or need to capture knowledge from long conversations before compression loses it.


RetainDB

Search nerd's pick. Hybrid vector + BM25 + reranking.

Combines multiple retrieval strategies for the highest-quality search results. Vector similarity catches semantic matches, BM25 catches exact keyword matches, and reranking puts the best results on top.
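One common way to merge a vector ranking with a BM25 ranking is reciprocal rank fusion (RRF). The sketch below uses RRF purely as an illustration. How RetainDB actually fuses and reranks results isn't covered here, and the document IDs and the conventional constant `k = 60` are assumptions. It covers the fusion step only, not the final reranking pass.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Merge several ranked lists of document IDs into one.

    Each document scores 1 / (k + rank) per list it appears in, so items
    ranked well by more than one retriever rise to the top.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest score first; ties are broken alphabetically for stable output.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


vector_hits = ["a", "b", "c"]  # hypothetical semantic-search ranking
bm25_hits = ["b", "d", "a"]    # hypothetical keyword ranking
fused = reciprocal_rank_fusion([vector_hits, bm25_hits])
```

Documents `a` and `b` appear in both lists, so they outrank `c` and `d`, which each appear in only one. A reranker would then reorder this short fused list using a heavier model.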

Setup:  hermes memory setup → select RetainDB
Tools:  retaindb_search, retaindb_store
Cost:   Paid

Best if: Retrieval quality is your top priority and you're willing to pay for it.


SuperMemory

Web research workflows. Browser-integrated memory.

Designed for memory that extends into the browser — captures and retrieves web content as part of your knowledge base.

Setup:  hermes memory setup → select SuperMemory
Cost:   See supermemory.ai pricing

Best if: Your workflow involves heavy web research and you want persistent memory of online content.


Cost Summary

| Tier | Providers | Notes |
| --- | --- | --- |
| Free, local | Holographic, Hindsight (local), OpenViking | No API keys, no cloud. Holographic is the easiest pick. |
| Free tier / freemium | Mem0, ByteRover | Start free, pay for higher limits |
| Paid cloud | Honcho, RetainDB, SuperMemory | Production features, team support |
| Always free (built-in) | MEMORY.md + USER.md | No setup, always active, 2,200 + 1,375 char limits |

My Recommendations

Just getting started?
Stick with built-in memory. It covers 80% of use cases. Add an external provider only when you hit its limits.

Want the best free local experience?
Hindsight (local daemon). Best benchmarks, nothing leaves your machine, structured knowledge graph.

Want zero config?
Holographic. Pick it in hermes memory setup and you're done. No API keys, no servers.

Want the easiest cloud setup?
Mem0. 30 seconds, free tier, hands-off extraction.

Running multi-agent or want deep user modeling?
Honcho. The dialectic reasoning is genuinely different from every other provider.

Care about token costs at scale?
OpenViking's tiered loading will save you 80-90% on tokens.


Migrating Between Providers

Switching is straightforward:

hermes memory setup      # pick new provider
hermes memory status     # confirm it's active

Your built-in memory (MEMORY.md, USER.md) stays intact regardless of which external provider you use. Note that external providers store data in their own backends — switching providers means starting fresh with the new one's knowledge base. There's no automated migration between providers yet.


Questions?

Drop them in the comments. I'm happy to help you pick the right setup for your use case.