The AI Agent That Learns While It Works — A Complete Guide to Hermes Agent
Aditya · 2026-05-23 · via DEV Community

This is a submission for the Hermes Agent Challenge


Most AI Agents Are Goldfish. Hermes Is Different.

Let me describe the standard agentic experience of 2024–2025.

You open a terminal, give an agent a task, watch it spin through a dozen steps, and feel genuinely impressed — right up until you close the session. Tomorrow, you start completely from scratch. The agent has no memory of what it learned, no idea who you are, and zero awareness that it made the same mistake three sessions ago.

You're its first user. Every single time.

"We've been shipping amnesia as a feature and calling it 'stateless architecture.' The emperor has no clothes, and the emperor can't remember what clothes were."

This is the problem that Hermes Agent, built by Nous Research, is genuinely trying to solve. Not with a wrapper around an existing API. Not with a clever prompt. With a fundamentally different architecture: a closed learning loop baked into the agent itself.

The longer Hermes runs, the more capable it becomes — at working with you, specifically.


What Is Hermes Agent, Really?

Before we get into setup, the architecture, or how it compares to other frameworks — it's worth being clear about what Hermes actually is, because it doesn't fit neatly into existing categories.

It's not a coding copilot tethered to an IDE.
It's not a chatbot wrapper around a single API.
It's not a rigid workflow automation engine.

It's an autonomous agent that lives wherever you put it — a $5 VPS, a GPU cluster, or serverless infrastructure that costs nearly nothing when idle. You can talk to it from Telegram while it works on a cloud VM you never SSH into yourself. It runs on your infrastructure. You own the runtime and the data.

Here's a quick snapshot of what ships in the box:

| Stat | Number |
| --- | --- |
| Built-in tools | 70+ |
| Messaging platform integrations | 20+ |
| Terminal execution backends | 6 |
| License | MIT |

Part 1: Getting Started — From Zero to Running in Under 5 Minutes

Step 1 — Install

One command. Works on Linux, macOS, WSL2, and Android via Termux.

# Linux / macOS / WSL2 / Android (Termux)
curl -fsSL https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.sh | bash

# Reload your shell after install:
source ~/.bashrc   # or source ~/.zshrc

# Windows (PowerShell, early beta):
irm https://raw.githubusercontent.com/NousResearch/hermes-agent/main/scripts/install.ps1 | iex


That's the whole install step. No dependency hunting. No pip install hell. The script handles everything and places the agent at ~/.hermes/hermes-agent.

Step 2 — Choose Your AI Provider

This is the most important setup step. Run hermes model for an interactive selection menu. Here are the main options worth knowing:

| Provider | Best For | Setup Method |
| --- | --- | --- |
| Nous Portal | Zero-config start, subscription-based | OAuth via hermes model |
| OpenRouter | Multi-model experimentation | API key |
| Anthropic | Claude models | OAuth (Max plan) or API key |
| GitHub Copilot | Using your existing subscription | OAuth via hermes model |
| Custom Endpoint | Local models via Ollama / vLLM / llama.cpp | Base URL + API key |

⚠️ Critical: Hermes requires a model with at least 64,000 tokens of context. Most hosted models (Claude, GPT, Gemini, Qwen, DeepSeek) meet this easily. If you're running locally, set your context size to at least 64K (e.g., --ctx-size 65536 in llama.cpp).
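
"64K" is ambiguous: it can mean 64,000 or 65,536 tokens. Here is a minimal sketch of a pre-flight check, assuming you already know your model's context window. The constant and function names are hypothetical and not part of Hermes.

```python
MIN_CONTEXT_TOKENS = 64_000  # minimum stated above; hypothetical constant name


def meets_context_requirement(ctx_size: int) -> bool:
    """Return True if a context window is large enough for Hermes."""
    return ctx_size >= MIN_CONTEXT_TOKENS


# 65,536 (2**16) clears the bar; a 32,768-token window does not.
for size in (32_768, 65_536):
    print(size, meets_context_requirement(size))
```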

You can switch providers at any time with hermes model — no lock-in.

Step 3 — Your First Session

hermes          # classic CLI
hermes --tui    # modern TUI with overlays and mouse support (recommended)


You'll see a welcome banner showing your provider, model, available tools, and loaded skills. Start with something easy to verify:

Summarize this repo in 5 bullets and tell me the main entrypoint.


Check my current directory and tell me what looks like the main project file.


What success looks like:

  • Banner shows your chosen model and provider
  • Hermes replies without error
  • It uses a tool when needed (terminal, file read, web search)
  • The conversation continues normally for more than one turn

Step 4 — Verify Sessions Work

This matters more than most tutorials mention:

hermes --continue   # Resume the most recent session
hermes -c           # Short form


If this works, you have persistent sessions. That's the foundation everything else is built on.

Step 5 — Add the Next Layer

Only after the base chat works. Don't skip ahead.

Messaging platforms (Telegram, Discord, Slack, WhatsApp, Signal, Email, and 15+ more):

hermes gateway setup


Skills — structured knowledge documents the agent loads on demand:

hermes skills search kubernetes
hermes skills install openai/skills/k8s


MCP servers — add to ~/.hermes/config.yaml:

mcp_servers:
  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "ghp_xxx"


Docker sandbox — for safety on production work:

hermes config set terminal.backend docker


Voice mode:

cd ~/.hermes/hermes-agent
uv pip install -e ".[voice]"
# Then inside a session: /voice on  (Ctrl+B to record)


💡 Pro tip: Run hermes doctor any time something feels off. It diagnoses config problems and tells you exactly what to fix. Don't add features until hermes doctor is clean.


Part 2: How the Learning Loop Actually Works

This is the part that separates Hermes from the rest of the field. Let me walk through what actually happens under the hood when you use Hermes over time.

The Five-Stage Learning Loop

Stage 1 — Context Loading
Before the agent responds to anything, it loads MEMORY.md (persistent facts about you and your projects) and USER.md (a behavioral model of who you are). It also discovers and loads any context files in your project directory — .hermes.md, AGENTS.md, CLAUDE.md, SOUL.md. The agent starts every session knowing your history.
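
To make the loading step concrete, here is a rough sketch of how discovery like this could work. It is not Hermes's actual implementation: the directory layout, the loading order, and the `load_context` helper are assumptions made for illustration. Only the file names come from the description above.

```python
from pathlib import Path
import tempfile

# File names taken from the description above; the loading order is an assumption.
PROJECT_CONTEXT_FILES = (".hermes.md", "AGENTS.md", "CLAUDE.md", "SOUL.md")


def load_context(memory_dir: Path, project_dir: Path) -> str:
    """Concatenate persistent memory and any project context files found."""
    candidates = [memory_dir / "MEMORY.md", memory_dir / "USER.md"]
    candidates += [project_dir / name for name in PROJECT_CONTEXT_FILES]
    sections = []
    for path in candidates:
        if path.is_file():  # missing files are simply skipped
            text = path.read_text(encoding="utf-8").strip()
            sections.append(f"## {path.name}\n{text}")
    return "\n\n".join(sections)


# Demo with throwaway directories so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    memory_dir, project_dir = Path(tmp, "memory"), Path(tmp, "project")
    memory_dir.mkdir()
    project_dir.mkdir()
    (memory_dir / "MEMORY.md").write_text("Prefers pytest over unittest.", encoding="utf-8")
    (project_dir / "AGENTS.md").write_text("Run `make lint` before committing.", encoding="utf-8")
    context = load_context(memory_dir, project_dir)
    print(context)
```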

Stage 2 — Tool Selection and Multi-Step Planning
From 70+ built-in tools, Hermes selects what the task needs. It can spawn subagents via delegate_task — up to 3 concurrent child agents by default, each with isolated context, restricted toolsets, and their own terminal sessions. This is how it parallelizes complex work without the threads stepping on each other.
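
The concurrency cap is the interesting part here. Below is a minimal sketch of bounded fan-out using an `asyncio.Semaphore`, assuming a limit of 3 as described above. `delegate_all` and the fake child work are hypothetical; this illustrates the pattern, not how `delegate_task` is implemented.

```python
import asyncio

MAX_CONCURRENT_CHILDREN = 3  # default limit described above


async def delegate_all(tasks: list[str], limit: int = MAX_CONCURRENT_CHILDREN) -> tuple[list[str], int]:
    """Run every task as a 'child agent', never more than `limit` at once.

    Returns the results in task order and the peak number of children
    that were running simultaneously.
    """
    limiter = asyncio.Semaphore(limit)
    running = 0
    peak = 0

    async def child(task: str) -> str:
        nonlocal running, peak
        async with limiter:
            running += 1
            peak = max(peak, running)
            context = {"task": task}  # each child gets its own isolated context
            await asyncio.sleep(0.01)  # stand-in for real work
            running -= 1
            return f"done: {context['task']}"

    results = await asyncio.gather(*(child(t) for t in tasks))
    return list(results), peak


results, peak = asyncio.run(delegate_all([f"task-{i}" for i in range(7)]))
print(results, "peak concurrency:", peak)
```

Seven tasks are queued, but the semaphore admits only three at a time; the rest wait for a slot.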

Stage 3 — Skill Creation for Novel Tasks
When Hermes successfully completes a task it hasn't done before, it can write a reusable Skill document. Next time it faces a similar problem — even in a different session — it loads the Skill and executes efficiently without rediscovering the approach from scratch.

Stage 4 — Memory Consolidation
After sessions, Hermes uses FTS5 (SQLite's full-text search extension) with LLM summarization to curate what's worth keeping. It doesn't dump raw logs into memory; it actively decides what to remember. This keeps memory bounded and useful even after hundreds of sessions. It also uses Honcho's dialectic user modeling to build a deepening picture of who you are across time.
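
For readers who haven't used FTS5, here is a small sketch of the search half of this stage using Python's built-in `sqlite3` module. The table name, columns, and sample notes are invented for illustration, and the LLM summarization step is left out. It assumes your SQLite build ships with FTS5, which most do.

```python
import sqlite3

# In-memory database; requires an SQLite build with the FTS5 extension enabled.
db = sqlite3.connect(":memory:")
db.execute("CREATE VIRTUAL TABLE sessions USING fts5(session_id, content)")
db.executemany(
    "INSERT INTO sessions (session_id, content) VALUES (?, ?)",
    [
        ("s1", "Fixed the flaky Kubernetes deployment by pinning the image tag"),
        ("s2", "Drafted the quarterly report outline in Markdown"),
        ("s3", "Kubernetes ingress returned 502 until the readiness probe was added"),
    ],
)


def recall(query: str, limit: int = 5) -> list[str]:
    """Return ids of past sessions matching the query, best match first."""
    rows = db.execute(
        "SELECT session_id FROM sessions WHERE sessions MATCH ? ORDER BY rank LIMIT ?",
        (query, limit),
    ).fetchall()
    return [row[0] for row in rows]


print(recall("kubernetes"))
```

Only the matching sessions would then be handed to a summarizer, which is what keeps the amount of text entering memory small.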

Stage 5 — Self-Improvement
Skills created in previous sessions are eligible for improvement. Hermes can notice when an old Skill isn't working optimally and update it mid-use. The agent gets measurably better at your specific workflows over weeks and months.

This is a closed loop. Most agents have none of it.

The execute_code Power Move

This is the feature that surprised me most. The execute_code tool lets Hermes write Python scripts that call its own tools programmatically, via sandboxed RPC execution:

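# Instead of: search → wait → read → wait → summarize → wait → write
# Hermes collapses this into a single LLM turn.
# Illustrative sketch: `tools` stands in for the tool bindings the sandbox
# exposes; the actual tool names and signatures may differ.

async def research_pipeline(topic):
    # One search call, then read only the top three hits
    results = await tools.web_search(query=topic, n=10)
    pages = [await tools.read_url(r.url) for r in results[:3]]
    # Summarize and persist without returning to the LLM between steps
    summary = await tools.summarize(pages, style="technical")
    await tools.write_file("research.md", summary)
    return summary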


This dramatically reduces the token cost of multi-step pipelines. Instead of burning inference tokens on "I will now do step 3 of 7," Hermes writes the whole pipeline as code and runs it. The LLM is only involved at the decision point, not at every mechanical step.
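
If you want to run the shape of such a script yourself, here is a self-contained variation with stub tools. Everything in `StubTools` is hypothetical: the methods mimic the calls used above and count invocations instead of doing real I/O. This version also reads the three pages concurrently with `asyncio.gather`.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class SearchResult:
    url: str


class StubTools:
    """Hypothetical stand-ins for agent tools; they count calls instead of doing I/O."""

    def __init__(self) -> None:
        self.calls = 0
        self.files: dict[str, str] = {}

    async def web_search(self, query: str, n: int) -> list[SearchResult]:
        self.calls += 1
        return [SearchResult(url=f"https://example.com/{query}/{i}") for i in range(n)]

    async def read_url(self, url: str) -> str:
        self.calls += 1
        return f"contents of {url}"

    async def summarize(self, pages: list[str], style: str) -> str:
        self.calls += 1
        return f"{style} summary of {len(pages)} pages"

    async def write_file(self, path: str, text: str) -> None:
        self.calls += 1
        self.files[path] = text


async def research_pipeline(tools: StubTools, topic: str) -> str:
    results = await tools.web_search(query=topic, n=10)
    # Fetch the top three pages concurrently rather than one after another.
    pages = await asyncio.gather(*(tools.read_url(r.url) for r in results[:3]))
    summary = await tools.summarize(list(pages), style="technical")
    await tools.write_file("research.md", summary)
    return summary


tools = StubTools()
summary = asyncio.run(research_pipeline(tools, "fts5"))
print(summary, "| tool calls:", tools.calls)
```

Six tool calls happen inside one script run, so the model only has to be consulted before and after, not between each call.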

Terminal Backends: It's Not Tied to Your Laptop

Six backends let you separate where you talk to Hermes from where it actually runs. The five covered here:

  • Local — direct execution, fast, simple
  • Docker — sandboxed isolation, the right choice for production work
  • SSH — talk locally, execute on a remote server
  • Daytona / Modal — serverless; the environment hibernates when idle and costs nearly nothing

The Modal and Daytona backends are worth understanding. You can talk to your Hermes agent from your phone via Telegram while the agent's actual work runs on a cloud VM. The environment sleeps when you're not using it. For a personal always-on assistant this changes the economics completely.
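
One way to picture swappable backends is as a thin layer that turns the same command into a different invocation. The sketch below is my own illustration of that idea, not Hermes's internals: the class names and the `build_argv` method are hypothetical, and only the local backend is actually executed.

```python
import shlex
import subprocess
import sys
from typing import Protocol


class TerminalBackend(Protocol):
    def build_argv(self, command: list[str]) -> list[str]: ...


class LocalBackend:
    def build_argv(self, command: list[str]) -> list[str]:
        return command  # run directly on this machine


class DockerBackend:
    def __init__(self, image: str) -> None:
        self.image = image

    def build_argv(self, command: list[str]) -> list[str]:
        # --rm discards the container once the command exits
        return ["docker", "run", "--rm", self.image, *command]


class SSHBackend:
    def __init__(self, host: str) -> None:
        self.host = host

    def build_argv(self, command: list[str]) -> list[str]:
        # ssh passes one string to the remote shell, so quote it safely
        return ["ssh", self.host, shlex.join(command)]


def run(backend: TerminalBackend, command: list[str]) -> str:
    """Execute the command through the given backend and return its stdout."""
    completed = subprocess.run(
        backend.build_argv(command), capture_output=True, text=True, check=True
    )
    return completed.stdout


print(run(LocalBackend(), [sys.executable, "-c", "print('hello')"]).strip())
print(DockerBackend("python:3.12-slim").build_argv(["ls"]))
print(SSHBackend("user@host").build_argv(["ls", "-la"]))
```

The caller never changes; only the backend object does. That is the property that lets the chat interface live in one place and the execution in another.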

The Skills System and agentskills.io

Skills are structured knowledge documents — procedures the agent loads on demand. They follow progressive disclosure: Hermes loads just the skill index first, then drills into full skill content only if the task requires it. Token usage stays low even in long sessions.
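
Progressive disclosure is easier to see in code. Here is a minimal sketch, assuming each skill has a one-line description and a longer body. The `Skill` and `SkillLibrary` classes and the sample skills are hypothetical and do not reflect Hermes's storage format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Skill:
    name: str
    description: str  # one line, cheap to keep in context
    body: str         # full procedure, loaded only when needed


class SkillLibrary:
    def __init__(self, skills: list[Skill]) -> None:
        self._skills = {s.name: s for s in skills}
        self.bodies_loaded = 0

    def index(self) -> str:
        """Level 1: names and one-line descriptions only."""
        return "\n".join(f"{s.name}: {s.description}" for s in self._skills.values())

    def load(self, name: str) -> str:
        """Level 2: the full body of a single skill."""
        self.bodies_loaded += 1
        return self._skills[name].body


library = SkillLibrary([
    Skill("k8s-rollback", "Roll back a failed Kubernetes deployment", "1. kubectl rollout undo ...\n" * 50),
    Skill("pg-backup", "Back up a PostgreSQL database", "1. pg_dump ...\n" * 50),
])

index = library.index()
body = library.load("k8s-rollback")
print(len(index), "characters in index;", len(body), "characters in the one skill loaded")
```

The index stays small no matter how long individual skills get, and the second skill's body never enters the context at all.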

Skills are compatible with the open agentskills.io standard — meaning skills you write for Hermes are portable and shareable with the community, and community skills work with your setup without any conversion.


Part 3: How Hermes Stacks Up Against Other Agentic Frameworks

There are serious alternatives in the open-source agent ecosystem. Here's an honest look at where Hermes fits.

Feature Hermes Agent AutoGen CrewAI OpenDevin
Persistent cross-session memory ✅ Native ⚠️ Limited
Autonomous skill creation ✅ Built-in
Multi-step tool use
Messaging platform gateway ✅ 20+ platforms
Runs on $5 VPS / serverless ✅ Yes ⚠️ Possible ⚠️ Possible ❌ GPU needed
Multi-agent delegation ✅ Subagents ✅ Core feature ✅ Crews / Flows ⚠️
Local / self-hosted LLM ✅ Any endpoint
Voice mode ✅ CLI + messaging
MCP server support ⚠️ Via plugins ⚠️ Via plugins
Primary focus Personal autonomous agent Multi-agent orchestration Role-based agent teams Software engineering

Plain-English Breakdown

AutoGen is brilliant if you want fine-grained control over agent-to-agent communication patterns. But it's a framework you orchestrate — not a ready-to-run agent. You write the coordination logic yourself.

CrewAI makes multi-agent teamwork feel intuitive — define crews with roles, let them coordinate. Great for structured pipelines. Less suited for the kind of open-ended "figure it out" sessions where Hermes excels.

OpenDevin is purpose-built for software engineering tasks: browsing, code execution, file editing. Excellent in its lane. That lane is narrower than Hermes's.

Hermes is the agent you'd deploy as your actual daily assistant — one that knows your name, remembers your projects, and gets measurably better at helping you over weeks and months. That's a genuinely different product from the others.

When NOT to use Hermes: If you need a complex multi-agent pipeline with dozens of specialized roles that coordinate on a strict workflow, AutoGen or CrewAI give you more structured control. If your only use case is automated software engineering on repos, Claude Code or OpenDevin are sharper tools. Hermes shines brightest as a personal, persistent, do-everything agent — not a single-purpose workflow engine.


Part 4: What Open Agentic Systems at This Capability Level Actually Mean

I want to say something that might be obvious, but I haven't seen it written plainly.

When your agent genuinely knows you — your preferences, your projects, your quirks, your bad habits — you need to be the one in control of that knowledge. Hermes runs on your infrastructure. You own what it learns about you.

That changes the trust calculus entirely.

Most AI products in this space are cloud-hosted services with "open-source" labels slapped on for marketing. You're renting access to an agent that lives on their servers, stores its state in their database, and disappears if they change their pricing. Hermes runs on a $5 VPS you control, hibernates when idle via Modal or Daytona, and costs nearly nothing when you're not actively using it. The economics and the ownership model are completely different.

The second thing worth saying: agents that compound matter more than agents that are capable on day one.

The tools with the most features in their initial release rarely win. The tools that get measurably better at working with you over time, that reduce the friction of repeated patterns, that remember what you learned together last Tuesday — those are the ones that stick.

Hermes is one of the very few systems being built with that end state explicitly in mind. Not as a future roadmap item. As a first-class architectural property, shipping today, MIT license, running on your hardware.

That's worth paying attention to.


Quick-Start Checklist

Before you go — here are the six things to actually do today:

  • [ ] Run the one-line installer
  • [ ] Run hermes model and pick a provider
  • [ ] Launch hermes --tui and complete your first conversation
  • [ ] Test session resume with hermes --continue
  • [ ] Find a skill with hermes skills search ... and install it with hermes skills install ...
  • [ ] Share what you built on DEV.to for the challenge 🎉


Built during the Hermes Agent Challenge, May 2026. If you found this useful, I'd love to see what you build — drop it in the comments.