I Built the Hermes + Claude Code Dual-Stack: Orchestrator Meets Coder — Here's the Full Architecture

The idea showed up in several places at once: a thread on Indie Hackers, a post on browseract, some Discord chatter. Hermes Agent as the orchestrator, Claude Code as the coder. Two agents, one stack, specialization instead of generalization. It made sense to me immediately. What nobody had published was a working architecture: actual config files, the MCP bridge that connects them, and the failure modes you hit when you try to run this in production.

I spent two weeks building and debugging this dual-stack setup. I run it daily now. This is the complete architecture: VPS side, local side, the MCP bridge between them, six production patterns, and an honest accounting of what broke and what stuck.

Why Dual-Stack at All

The single-agent problem is real. I was running Claude Code as my primary agent for everything — code generation, git operations, GitHub issue management, cron-like scheduling via shell scripts. It works, but there are gaps. Claude Code is exceptional at code generation, file editing, and TypeScript/React tasks. It is not designed for persistent messaging, Telegram integration, or scheduling. Every time I wanted a notification when a task completed, I had to wire something up manually. Every time I wanted to trigger a code task from my phone, I was doing gymnastics.

Hermes is the inverse. It is excellent at persistent orchestration — running tasks on a schedule, consuming messages from Telegram or Discord, maintaining memory across sessions, coordinating multi-step workflows that span minutes or hours. It is not a specialized coder. When Hermes writes code, it is using a general-purpose model for a task where Claude Code has been specifically optimized.

The dual-stack argument is simple: let each agent do what it does best and bridge them. The complexity cost is real — two agent runtimes, a bidirectional MCP bridge, synchronized config — but the capability gain justifies it if you are building in public or running a product that needs both coding and operational automation.

Architecture Overview

The stack has three components:

Hermes on VPS — always-on, receives Telegram messages, runs cron jobs, maintains memory, and dispatches work
Claude Code on local machine — has full codebase access, IDE integration, runs tests, writes and edits files
MCP bridge — bidirectional: Claude Code calls Hermes tools, Hermes calls Claude Code tools

The key architectural decision is that Hermes runs on the VPS permanently. It is the always-on component. It receives messages from the outside world and decides what to do with them. Claude Code runs on my laptop where the codebase lives. It is the coding workhorse. The MCP bridge connects them so each can use the other as a tool.

Here is the data flow for the most common workflow — a Telegram message triggering a code change:

Phone (Telegram) → Hermes (VPS) → MCP bridge → Claude Code (local) → filesystem → git → GitHub → Hermes → Telegram → Phone

The notification loop is closed. I send a message from my phone, the code change happens, and I get a confirmation back on my phone. No manual steps in between.

VPS Side Setup

Hermes runs on a Hostinger VPS (Ubuntu 22.04, 4GB RAM). It starts via systemd on boot and stays running. The full config:

# ~/.hermes/config.yaml
# Hermes v0.13.0 — VPS production config

model: anthropic/claude-sonnet-4-6

systemPrompt: |
  You are a persistent orchestration agent running on a VPS. Your role is:
  1. Receive and triage messages from Telegram
  2. Maintain a task queue and dispatch coding tasks to Claude Code via MCP
  3. Monitor task completion and send confirmations back via Telegram
  4. Run scheduled workflows at their designated times

  IMPORTANT: Do not write code directly. For any code generation, file editing,
  or git operations, use the claude_code tool. You are the orchestrator; Claude
  Code is the coder. Your job is planning and coordination.

models:
  router:
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0.0
    use_for: task_classification
  summarizer:
    provider: anthropic
    model: claude-haiku-4-5-20251001
    temperature: 0.0
    use_for: result_summarization

messaging:
  telegram:
    token: "${TELEGRAM_BOT_TOKEN}"
    allowed_chat_ids:
      - "${TELEGRAM_OWNER_CHAT_ID}"
    on_message:
      route_to: task_queue
      confirm_receipt: true

memory:
  backend: sqlite
  path: ~/.hermes/memory.db
  retention_days: 90

mcpServers:
  filesystem:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-filesystem@1.9.2", "/root/storefront"]
    capabilities:
      prompts: false
      resources: true
      tools: true

  git:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-git@0.6.2"]
    env:
      GIT_AUTHOR_NAME: "Hermes Orchestrator"
      GIT_AUTHOR_EMAIL: "hermes@wowhow.cloud"
    capabilities:
      prompts: false
      resources: false
      tools: true

  github:
    command: npx
    args: ["-y", "@modelcontextprotocol/server-github"]
    env:
      GITHUB_PERSONAL_ACCESS_TOKEN: "${GITHUB_TOKEN}"
    include:
      - list_issues
      - get_issue
      - create_issue
      - add_issue_comment
      - list_pull_requests
      - create_pull_request
      - get_pull_request
    capabilities:
      prompts: false
      resources: false
      tools: true

  claude_code:
    command: claude
    args: ["mcp", "serve"]
    description: "Claude Code — use for all code generation, TypeScript, React, file edits, and git commits"
    include:
      - edit_file
      - create_file
      - read_file
      - run_bash_command

serve:
  transport: stdio
  name: hermes-orchestrator
  description: |
    Hermes persistent orchestration agent. Accepts natural language task
    descriptions, manages a Telegram gateway, runs cron workflows, and
    delegates code generation to Claude Code.

The cron jobs live in a separate file to keep the main config readable:

# ~/.hermes/cron.yaml
jobs:
  morning-triage:
    # 7 AM IST = 1:30 AM UTC on weekdays
    schedule: "30 1 * * 1-5"
    task: |
      Pull the last 24 hours of open GitHub issues and pull requests.
      Categorize them by priority: blocking, needs-review, backlog.
      Create a summary and send it to Telegram with counts and top 3 items.
    model: anthropic/claude-haiku-4-5-20251001
    max_steps: 12
    on_failure:
      notify: telegram

  pr-triage:
    # Every 4 hours
    schedule: "0 */4 * * *"
    task: |
      Check for pull requests that have been open more than 48 hours without
      a review. Post a reminder comment on each one. Do not create new PRs.
    model: anthropic/claude-haiku-4-5-20251001
    max_steps: 8
    dry_run: false

  weekly-digest:
    # Sunday 9 PM IST = 3:30 PM UTC
    schedule: "30 15 * * 0"
    task: |
      Generate a weekly summary: commits merged, issues closed, PRs opened.
      Query GitHub for the past 7 days. Format as markdown and send to Telegram.
    model: anthropic/claude-sonnet-4-6
    max_steps: 15
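
The IST-to-UTC conversions in the schedule comments are easy to get wrong by hand. Here is a minimal sketch for deriving the cron fields, assuming the cron runner on the VPS evaluates schedules in UTC. The helper name `ist_to_utc_cron` is hypothetical and not part of Hermes.

```python
from datetime import datetime, timedelta, timezone

# IST is a fixed UTC+05:30 offset with no daylight saving time.
IST = timezone(timedelta(hours=5, minutes=30))


def ist_to_utc_cron(hour: int, minute: int = 0) -> str:
    """Return the 'minute hour' cron fields in UTC for an IST wall-clock time."""
    # The date is arbitrary; only the time of day matters for a fixed offset.
    local = datetime(2026, 1, 5, hour, minute, tzinfo=IST)
    utc = local.astimezone(timezone.utc)
    return f"{utc.minute} {utc.hour}"


print(ist_to_utc_cron(7))   # morning-triage: "30 1"
print(ist_to_utc_cron(21))  # weekly-digest: "30 15"
```

One caveat the sketch does not handle: any IST time before 05:30 falls on the previous UTC day, so the day-of-week field would need to shift as well.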

Start the cron runner as a daemon:

hermes cron start --config ~/.hermes/cron.yaml --daemon

The systemd unit file that keeps Hermes running on the VPS:

# /etc/systemd/system/hermes.service
[Unit]
Description=Hermes Orchestration Agent
After=network.target

[Service]
Type=simple
User=root
WorkingDirectory=/root
ExecStart=/usr/local/bin/hermes serve --config /root/.hermes/config.yaml
Restart=on-failure
RestartSec=10
Environment=ANTHROPIC_API_KEY=your-key-here
Environment=TELEGRAM_BOT_TOKEN=your-token-here
Environment=TELEGRAM_OWNER_CHAT_ID=your-chat-id-here
Environment=GITHUB_TOKEN=your-github-token-here

[Install]
WantedBy=multi-user.target

# Reload unit files so systemd picks up the new hermes.service
systemctl daemon-reload
systemctl enable hermes
systemctl start hermes
systemctl status hermes

Local Side Setup

Claude Code runs on my MacBook where the codebase lives. The configuration that registers Hermes as an MCP server goes in ~/.claude.json (global, so it is available in every project):

# ~/.claude.json
{
  "mcpServers": {
    "hermes": {
      "command": "ssh",
      "args": [
        "-T",
        "root@your-vps-ip",
        "hermes mcp serve --config /root/.hermes/config.yaml"
      ],
      "description": "Hermes orchestration agent on VPS — use for Telegram messaging, scheduled tasks, GitHub operations, and multi-step workflows. Do not use for code generation or file editing."
    }
  }
}

The SSH transport is the key implementation detail here. Claude Code spawns the MCP server by running the configured command with its args and communicating over stdio. With ssh -T, the stdio of the local Claude Code process is piped over SSH to Hermes running on the VPS. From Claude Code's perspective, it is talking to a local MCP server. From Hermes's perspective, it is receiving MCP protocol messages via stdin from an SSH session.
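
To make the stdio idea concrete, here is a rough sketch of what the client side does, assuming (as I understand the MCP stdio transport) that each JSON-RPC message travels as one newline-terminated line. The helper names are hypothetical, and a local echo process stands in for the remote server so the example runs without a VPS.

```python
import json
import subprocess
import sys


def bridge_argv(host: str, remote_cmd: str) -> list[str]:
    """Build the argv a client would spawn for the SSH stdio bridge."""
    # -T disables pseudo-terminal allocation so stdio stays a clean byte pipe.
    return ["ssh", "-T", host, remote_cmd]


def frame(message: dict) -> bytes:
    """Serialize one JSON-RPC message as a single newline-terminated line."""
    return (json.dumps(message, separators=(",", ":")) + "\n").encode("utf-8")


def round_trip(argv: list[str], message: dict) -> dict:
    """Send one message to a child's stdin and read one line from its stdout."""
    proc = subprocess.Popen(argv, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    out, _ = proc.communicate(frame(message), timeout=10)
    return json.loads(out.splitlines()[0])


# Stand-in for the remote server: a child process that echoes one line back.
echo_child = [sys.executable, "-c",
              "import sys; sys.stdout.write(sys.stdin.readline())"]
ping = {"jsonrpc": "2.0", "id": 1, "method": "ping"}
print(round_trip(echo_child, ping) == ping)
```

Swapping `echo_child` for `bridge_argv(...)` is the only change SSH introduces, which is why the bridge needs no extra moving parts beyond key-based auth.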

This is the cleanest approach I found. The alternatives — running a local Hermes proxy that connects to the VPS, or exposing Hermes on a port and using HTTP transport — both introduce more failure points. The SSH stdio bridge is two commands and a config entry.

For the SSH connection to work without prompting for credentials, set up key-based auth first:

ssh-copy-id root@your-vps-ip
# Verify it works without password:
ssh -T root@your-vps-ip echo "bridge works"

You can also use a project-level .mcp.json if you only want Hermes available in specific repositories:

# storefront/.mcp.json
{
  "mcpServers": {
    "hermes": {
      "command": "ssh",
      "args": [
        "-T",
        "root@your-vps-ip",
        "hermes mcp serve --config /root/.hermes/config.yaml"
      ],
      "description": "Hermes orchestration agent — Telegram, GitHub, cron dispatch"
    }
  }
}

The MCP Bridge

The bridge is bidirectional. Claude Code can call Hermes tools. Hermes can call Claude Code tools. Both sides are configured as MCP servers that the other agent registers as a client.

From Claude Code's perspective, Hermes appears as a set of tools. The primary ones Hermes exposes:

run_task — execute a multi-step orchestration task
send_telegram — send a message to a specific Telegram chat
query_memory — query Hermes's persistent memory store
schedule_task — add a one-off task to Hermes's cron queue
get_github_context — fetch issues, PRs, and recent commits from the configured repository

From Hermes's perspective, Claude Code appears as a set of tools. The primary ones Claude Code exposes when run as an MCP server:

edit_file — make targeted edits to an existing file
create_file — create a new file with specified content
read_file — read file content
run_bash_command — run a shell command in the project directory

Start Claude Code as an MCP server (for Hermes to call):

claude mcp serve
# Listens on stdio — Hermes spawns this process via its claude_code mcpServer config

The bidirectionality is what makes the system genuinely useful. Without it, you have two separate agents with no way to coordinate. With it, each agent can delegate to the other for tasks in the other's specialty.

One thing I learned the hard way: the connection from Hermes on the VPS to Claude Code on my local machine requires the local machine to be running and reachable. Hermes can trigger Claude Code tasks only when my MacBook is on and the SSH reverse tunnel is active. For scheduled tasks that run at 7 AM, I need either: (a) my MacBook on, (b) Claude Code also running on the VPS with VPS codebase access, or (c) Hermes falling back to doing the code work itself. I chose option (b) for cron-triggered coding tasks — a second Claude Code instance on the VPS with read/write access to the deployed code.
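
The routing decision can be sketched as a small function. This is an illustration of the rule I follow, not Hermes code: the names `pick_target` and `tcp_reachable`, the target labels, and the tunnel port in the usage note are all hypothetical.

```python
import socket
from typing import Callable


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_target(trigger: str, laptop_up: Callable[[], bool]) -> str:
    """Choose where a coding task goes: 'vps', 'local', or 'queue'."""
    if trigger == "cron":
        # Scheduled work must not depend on the laptop being awake.
        return "vps"
    # Message-triggered work waits in the queue until the laptop is reachable.
    return "local" if laptop_up() else "queue"


print(pick_target("cron", lambda: False))      # vps
print(pick_target("telegram", lambda: True))   # local
print(pick_target("telegram", lambda: False))  # queue
```

In practice the probe would be something like `lambda: tcp_reachable("127.0.0.1", 2222)`, where 2222 is a made-up port for the reverse tunnel endpoint on the VPS.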

Six Production Patterns

Pattern 1: Phone → Hermes → Claude Code (Telegram Trigger)

The most-used pattern. I send a Telegram message describing a code change I want. Hermes receives it, decides it is a coding task, dispatches it to Claude Code via the MCP bridge, and sends me a confirmation with the result.

Example message from my phone:

Add a "last updated" timestamp to every blog post card on the /blogs listing page.
Format it as "Updated May 16, 2026". Use the published_at field. Mobile-friendly.

Hermes's processing:

1. Classify: coding task → route to Claude Code
2. Add context: attach relevant file paths from memory (blog card component location)
3. Call claude_code.read_file: read current component
4. Call claude_code.edit_file: apply the timestamp addition
5. Call claude_code.run_bash_command: run tsc --noEmit to verify
6. Call git tool: commit the change to a feature branch
7. Call github tool: open a draft PR
8. Call send_telegram: "Done. PR #47 opened. Timestamp added to BlogCard.tsx."

The total elapsed time from my Telegram message to the Telegram confirmation is 90–180 seconds, depending on file size and TypeScript complexity. I am not at my computer for any of it.

Pattern 2: Cron → Hermes → Claude Code (Scheduled PR Triage)

Every morning at 7 AM IST, Hermes runs the morning-triage cron job. For issues tagged needs-fix that have been open more than 3 days, Hermes dispatches a task to Claude Code to generate a fix proposal:

# Hermes cron task (simplified)
1. List GitHub issues tagged "needs-fix" older than 3 days
2. For each issue:
   a. Read the relevant code file via claude_code.read_file
   b. Call claude_code.run_task: "Propose a fix for this issue. Read the current code and describe the change needed, then implement it in a new branch."
   c. Open a draft PR with the proposed fix
   d. Post a comment on the original issue linking to the PR
3. Send Telegram summary: "Fixed 2 issues overnight. PRs #48, #49 ready for review."

This pattern requires Claude Code to be available on the VPS (or the local machine to be on). I keep a Claude Code session running on the VPS in a tmux pane during working hours. For overnight tasks, I use the VPS Claude Code instance with access to the mirrored codebase.

Pattern 3: Claude Code → Hermes → Telegram (Completion Notification)

Claude Code can call Hermes tools directly. After completing a long implementation task, Claude Code calls hermes.send_telegram to notify me:

# Claude Code side — after completing a task
# Claude Code calls the hermes MCP tool automatically when instructed:
# "When you finish, send me a Telegram notification with a summary."

# The tool call Claude Code makes:
hermes.send_telegram({
  message: "Done: Implemented the dual-stack blog post. Files modified: 2. TypeScript: clean. Build: passing. PR branch: feat/dual-stack-blog-post."
})

This is the simplest pattern and the one I use most often during active development. I start a Claude Code task, tell it to notify me when done, and go do something else. The Telegram ping tells me when to come back.

Pattern 4: Hermes Skill → Claude Code Tool

Hermes supports skills — reusable task templates stored in ~/.hermes/skills/. A skill can reference Claude Code as a tool. Here is the blog-writer skill that I use frequently:

# ~/.hermes/skills/blog-writer.yaml
name: blog-writer
description: Write and publish a technical blog post to the WOWHOW storefront

steps:
  - name: research
    task: |
      Research the topic: ${TOPIC}
      Find 3-5 technical sources. Summarize key points and data.
      Output: research_summary (markdown)
    model: anthropic/claude-sonnet-4-6

  - name: outline
    task: |
      Create a detailed outline for a ${WORD_COUNT}-word technical blog post
      about ${TOPIC}. Use the research from the previous step.
      Include all H2 sections, key code blocks, and conclusion.
    model: anthropic/claude-sonnet-4-6

  - name: write
    tool: claude_code
    task: |
      Write the full blog post based on the outline. Create the TypeScript
      data file at the correct path in src/data/blog-posts/. Follow the
      exact format used in existing blog post files. Escape all template
      literal variables with backslash. Return the file path when done.

  - name: register
    tool: claude_code
    task: |
      Add the import and spread to src/data/blog-posts.ts. Add the slug
      to the POST_ORDER array at position 0. Verify TypeScript compiles.

  - name: notify
    tool: send_telegram
    message: "Blog post created: ${TOPIC}. File: ${write.result}. Compile: clean."

Invoke from Telegram:

/skill blog-writer TOPIC="Hermes dual-stack architecture" WORD_COUNT=4200

The skill runs end-to-end: research, outline, write (delegated to Claude Code), register (delegated to Claude Code), notify. I get a Telegram message when it is done.

Pattern 5: Shared Knowledge Base

Hermes maintains a persistent SQLite memory store. Claude Code reads from CLAUDE.md and project files. The gap between these two knowledge sources is a real problem — Hermes knows things Claude Code does not, and vice versa.

I bridge this in two ways. First, I have a cron job that exports Hermes's memory to a markdown file that Claude Code can read:

# ~/.hermes/cron.yaml
  memory-sync:
    schedule: "0 * * * *"    # every hour
    task: |
      Export all memory entries tagged "architecture" or "decisions" to
      /root/storefront/storefront/.hermes-context.md in markdown format.
      Format: ## [tag] \n - [key]: [value] \n
    model: anthropic/claude-haiku-4-5-20251001
    max_steps: 5

Second, I add a reference to this file in CLAUDE.md:

# In storefront/CLAUDE.md, add:
## Hermes Context
Read `.hermes-context.md` for decisions and context that Hermes has recorded.
This file is auto-updated hourly by Hermes's memory-sync cron job.
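
The memory-sync job leaves the formatting to the model. If that output drifts, a deterministic renderer is one alternative. Here is a minimal sketch, assuming each memory entry can be read as a dict with `tag`, `key`, and `value` fields. That schema is my guess from the format string, not Hermes's actual storage layout.

```python
from collections import defaultdict

# Only these tags are exported, matching the memory-sync task description.
EXPORT_TAGS = ("architecture", "decisions")


def render_context(entries: list[dict]) -> str:
    """Render memory entries as markdown, one heading per exported tag."""
    by_tag: dict[str, list[dict]] = defaultdict(list)
    for entry in entries:
        if entry["tag"] in EXPORT_TAGS:
            by_tag[entry["tag"]].append(entry)

    lines: list[str] = []
    for tag in EXPORT_TAGS:  # fixed order keeps hourly diffs small
        if by_tag[tag]:
            lines.append(f"## {tag}")
            lines.extend(f" - {e['key']}: {e['value']}" for e in by_tag[tag])
    return "\n".join(lines) + "\n"


sample = [
    {"tag": "decisions", "key": "bridge", "value": "SSH stdio"},
    {"tag": "scratch", "key": "x", "value": "y"},
    {"tag": "architecture", "key": "orchestrator", "value": "Hermes on VPS"},
]
print(render_context(sample), end="")
```

Entries with other tags are dropped, so scratch notes never leak into the file Claude Code reads.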

The reverse direction, Claude Code writing to Hermes memory, uses the hermes.write_memory tool when it is available; hermes.query_memory covers reads. I instruct Claude Code to record important architectural decisions to Hermes memory at the end of significant tasks. This keeps Hermes's context current without manual updates.

Pattern 6: Multi-Agent Kanban with Claude Code as Worker

For larger features that span multiple sessions, I use Hermes as a Kanban board manager and Claude Code as the task worker. Hermes maintains a task list in its memory store, assigns tasks, tracks completion, and dispatches the next task when the previous one finishes.

# Hermes task queue in memory (simplified schema)
tasks:
  - id: T001
    title: "Add dual-stack blog post"
    status: done
    assigned_to: claude_code
    completed_at: "2026-05-16T22:00:00Z"

  - id: T002
    title: "Add Hermes skill index page"
    status: in_progress
    assigned_to: claude_code
    started_at: "2026-05-16T22:15:00Z"

  - id: T003
    title: "Update tools registry with 5 new tools"
    status: pending
    assigned_to: claude_code
    depends_on: [T002]

When Claude Code completes T002, it calls hermes.run_task with a task completion report. Hermes marks T002 done, checks dependencies, and dispatches T003 to Claude Code automatically. I get a Telegram update after each task completes.
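
The sequencing rule is small enough to sketch. This is an illustration against the simplified schema above, not Hermes's implementation, and the function names are hypothetical.

```python
def next_dispatchable(tasks: list[dict]) -> list[str]:
    """Return ids of pending tasks whose dependencies are all done."""
    done = {t["id"] for t in tasks if t["status"] == "done"}
    return [
        t["id"]
        for t in tasks
        if t["status"] == "pending"
        and all(dep in done for dep in t.get("depends_on", []))
    ]


def complete(tasks: list[dict], task_id: str) -> list[str]:
    """Mark task_id done, then report which tasks became dispatchable."""
    for t in tasks:
        if t["id"] == task_id:
            t["status"] = "done"
    return next_dispatchable(tasks)


tasks = [
    {"id": "T001", "status": "done"},
    {"id": "T002", "status": "in_progress"},
    {"id": "T003", "status": "pending", "depends_on": ["T002"]},
]
print(next_dispatchable(tasks))  # [] while T002 is still in progress
print(complete(tasks, "T002"))   # ['T003']
```

A task with no `depends_on` entry is dispatchable as soon as it is pending, which matches how I queue independent work.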

This pattern is useful for late-night work — I set up the Kanban queue before sleeping and wake up to a Telegram thread showing what was completed overnight. Claude Code on the VPS handles the actual execution. Hermes handles sequencing and notification.

What Broke

This section is the most important one. Architectural diagrams always look cleaner than the implementation.

MCP stdio bridge latency. The SSH stdio bridge adds 200–400 ms to every MCP tool call. For interactive use this is borderline acceptable. For workflows with 15+ tool calls, the delay adds up noticeably. A Hermes task that calls Claude Code 10 times spends 2–4 seconds on bridge overhead alone, before accounting for model inference time. I have not found a way to reduce this without switching to a persistent TCP connection, which introduces its own complexity.
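
The overhead estimate is plain multiplication, but it helps to have it as a function when budgeting longer workflows. This is a back-of-envelope sketch using the per-call range I measured; the function name is hypothetical.

```python
def bridge_overhead_s(calls: int,
                      per_call_ms: tuple[int, int] = (200, 400)) -> tuple[float, float]:
    """Return the (low, high) bridge overhead in seconds for a number of tool calls."""
    low_ms, high_ms = per_call_ms
    return calls * low_ms / 1000, calls * high_ms / 1000


print(bridge_overhead_s(10))  # (2.0, 4.0)
print(bridge_overhead_s(15))  # (3.0, 6.0)
```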

Session state is not shared. Hermes has its SQLite memory store. Claude Code has CLAUDE.md and project context. There is no live shared state — the two agents do not see each other's current reasoning. When Hermes dispatches a task to Claude Code, Claude Code has no awareness of what Hermes is "thinking about" beyond what Hermes explicitly passes in the task description. I compensate by writing verbose task descriptions, but this is a real limitation. Long tasks sometimes lose context mid-execution because the task description did not include a fact that seemed obvious from Hermes's perspective.

Skill format mismatch. Hermes skills use Hermes's YAML format with Hermes-specific interpolation syntax. Claude Code skills use a different format. You cannot share skill definitions between the two systems — you maintain two separate skill libraries. I tried to abstract this into a shared format and gave up after a day. The formats are different enough that a translation layer would be more maintenance than two separate libraries.

Claude Code MCP server is not designed for concurrent access. When Hermes runs two cron jobs simultaneously and both dispatch tasks to Claude Code, you get race conditions. The second task waits for the first to complete (or fails with a lock error). I now stagger cron jobs by at least 5 minutes and ensure no two jobs can both dispatch to Claude Code in the same window.
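
Staggering schedules works, but the underlying fix is to serialize dispatches. Here is a minimal sketch of that idea, assuming both jobs dispatch from the same process, which I have not verified for Hermes. Across separate processes you would need a file lock or a queue instead. All names here are hypothetical.

```python
import threading
import time

# One Claude Code task at a time: every dispatch must hold this lock.
_coder_lock = threading.Lock()


def dispatch(task: str, run, log: list[str]) -> None:
    """Run one coding task while holding the lock so dispatches never overlap."""
    with _coder_lock:
        log.append(f"start {task}")
        run(task)
        log.append(f"end {task}")


log: list[str] = []
jobs = [
    threading.Thread(target=dispatch,
                     args=(name, lambda _task: time.sleep(0.05), log))
    for name in ("morning-triage", "pr-triage")
]
for job in jobs:
    job.start()
for job in jobs:
    job.join()
print(log)
```

Whichever job acquires the lock first runs to completion before the other starts, so the log always shows a start/end pair for one task followed by a start/end pair for the other.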

SSH tunnel drops. If the SSH connection drops mid-task, the MCP bridge silently dies. Hermes does not always detect this cleanly. Sometimes it retries the same tool call instead of reporting a connection failure. I added a keepalive to the SSH config and a health check to the Hermes config, but this remains the flakiest part of the setup. With the keepalive settings below, ssh abandons an unresponsive connection after roughly 90 seconds (3 missed probes at 30-second intervals).

# ~/.ssh/config — add for the VPS connection
Host your-vps-ip
  ServerAliveInterval 30
  ServerAliveCountMax 3
  ConnectTimeout 10

The Workflow That Stuck

After two weeks of debugging and two more weeks of daily use, one workflow pattern has become my default and I expect it to stay:

Phone-driven development.

Morning: Wake up. Open Telegram. Check the overnight Hermes summary (morning-triage cron). It tells me what is open, what needs review, and what broke. I read it on my phone before getting out of bed.

During the day: When I think of a code change while away from my computer, I send a Telegram message. Hermes routes it to the task queue. When my computer is on and Claude Code is running, the task executes. I get a confirmation when it is done. I do not need to be at the computer to initiate the work or to know when it is complete.

Evening: I review what was done. If something needs manual review, the PR is already open. I read it, request any changes via GitHub (or via Telegram → Hermes → GitHub comment), and move on.

The dual-stack did not change how I write code — Claude Code does that the same as before. What changed is the coordination layer around the coding. Scheduling, notification, context from the outside world — all of that now runs through Hermes, always-on, without me needing to be at my desk to trigger or monitor it.

The question people usually ask at this point: is it worth the setup complexity? For a solo founder running a product with 1,800+ items, a blog pipeline, and daily SEO and coding work — yes. If you are doing weekend projects or one-off builds, probably not. The setup cost is three to four hours of real work. The maintenance burden is low once it is running. The payoff is a development loop that runs while I sleep and notifies me on my phone.

Full Config Summary

All the config files in one place for reference:

# VPS: ~/.hermes/config.yaml — Hermes orchestrator
# VPS: ~/.hermes/cron.yaml — Scheduled jobs
# VPS: /etc/systemd/system/hermes.service — Auto-start
# Local: ~/.claude.json — Claude Code global MCP registration
# Local: storefront/.mcp.json — Project-level MCP registration (optional)
# Local: ~/.ssh/config — SSH keepalive settings
# Shared: storefront/.hermes-context.md — Hourly memory sync output

The minimum viable dual-stack is three files: ~/.hermes/config.yaml on the VPS with the claude_code mcpServer entry, ~/.claude.json on your local machine with the hermes mcpServer entry using SSH transport, and SSH key auth set up between the two machines. Everything else — cron, Telegram, memory sync, skills — layers on top of that foundation.

Start with the minimum. Get the bridge working. Run a test task end-to-end. Then add the pieces that match your specific workflow.

Originally published at wowhow.cloud
