5 Things I Wish I Knew Before Building with Hermes Agent
pulkitgovran · 2026-05-25 · via DEV Community

This is a submission for the Hermes Agent Challenge: Write About Hermes Agent

I spent a week building Shadow CTO, a persistent AI memory layer for GitHub repositories, as a production system on Hermes Agent. Here's what tripped me up and what I'd tell myself on day one.


1. Session IDs Are Your Most Important Design Decision

Make them meaningful from the start. A UUID is fine for experiments. For anything real, encode the domain semantics into the ID itself.

import uuid

# Bad: opaque, can't debug, can't reason about isolation
session_id = str(uuid.uuid4())

# Good: self-documenting, domain-meaningful (pick the shape that matches your domain)
session_id = f"repo:{owner}/{repo_name}"      # per-repository brain
session_id = f"user:{user_id}:support"         # per-user support history
session_id = f"project:{project_id}:planning"  # per-project context


Why this matters more than you think: You cannot merge sessions. If you realize six weeks in that you gave all users the same session ID instead of per-user IDs, you are starting over. There's no migration path. The accumulated memory is gone.

Spend 30 minutes mapping your domain to session IDs before writing a single API call. It's the highest-leverage design decision in a Hermes-backed system.
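One way to keep that mapping in a single place is a small builder that every call site goes through. This is a minimal sketch: `make_session_id`, `repo_session`, and `user_support_session` are hypothetical helpers of my own, not part of Hermes, and the allowed character set is an assumption you should adjust to your domain.

```python
import re

# Hypothetical helper, not a Hermes API. It builds IDs in the
# "scope:part[:part]" shape used in the examples above.
_SEGMENT = re.compile(r"[A-Za-z0-9._/-]+")  # assumed character set


def make_session_id(scope: str, *parts: str) -> str:
    """Join a scope and its parts with ':' after validating each segment."""
    segments = (scope, *parts)
    if len(segments) < 2:
        raise ValueError("a session ID needs a scope and at least one part")
    for segment in segments:
        # fullmatch rejects empty segments and stray ':' separators
        if not _SEGMENT.fullmatch(segment):
            raise ValueError(f"invalid session ID segment: {segment!r}")
    return ":".join(segments)


def repo_session(owner: str, repo_name: str) -> str:
    """One memory per repository."""
    return make_session_id("repo", f"{owner}/{repo_name}")


def user_support_session(user_id: str) -> str:
    """One support history per user."""
    return make_session_id("user", user_id, "support")
```

Routing every ID through one module means a granularity mistake shows up in code review rather than six weeks into accumulated memory.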


2. Feed Events, Not Documents

My first instinct was to dump entire files and documents into Hermes and then query them. This works but misses the point entirely.

Hermes's memory is most powerful when you feed it events — things that happened, decisions that were made, changes that occurred over time. Not static content.

# Weaker: dumping a document
await chat(
    "Here is our entire architecture documentation: [5,000 words of context]",
    session_id="repo:acme/backend",
)

# Stronger: feeding events as they happen
await chat(
    "Decision made on 2026-03-14:\n"
    "Switched from PostgreSQL to MongoDB.\n"
    "Reason: Schema flexibility needed for dynamic user preferences.\n"
    "Impact: Migration took 3 sprints, 2 rollback incidents.",
    session_id="repo:acme/backend",
)


The second approach lets Hermes build causal understanding. It knows what came before this decision and can reason about what changed as a result. By the 50th event, it has context that a document dump can never provide: sequence, causality, and pattern.

The rule of thumb: if the information has a timestamp and something happened, feed it as an event. If it's static reference material, RAG is the better tool.
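To keep events consistent from one ingest call to the next, it can help to build the message text from a typed record instead of hand-writing each string. The sketch below is only an illustration: `DecisionEvent` and `format_event` are hypothetical names, and the field layout simply mirrors the example message above.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DecisionEvent:
    """A timestamped thing that happened (hypothetical shape, not a Hermes type)."""
    when: date
    decision: str
    reason: str
    impact: str


def format_event(event: DecisionEvent) -> str:
    """Render an event in the same layout as the hand-written message above."""
    return (
        f"Decision made on {event.when.isoformat()}:\n"
        f"{event.decision}\n"
        f"Reason: {event.reason}\n"
        f"Impact: {event.impact}"
    )


event = DecisionEvent(
    when=date(2026, 3, 14),
    decision="Switched from PostgreSQL to MongoDB.",
    reason="Schema flexibility needed for dynamic user preferences.",
    impact="Migration took 3 sprints, 2 rollback incidents.",
)
message = format_event(event)
```

The resulting `message` is what you would pass as the text argument to the `chat(...)` helper shown earlier, together with the session ID.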


3. Your System Prompt Is a Memory Architecture Decision

Hermes is smart, but it needs to know what to do with incoming information. A generic system prompt produces shallow retention. A structured extraction prompt produces deep, queryable memory.

# Shallow — Hermes stores raw text, answers surface questions
SYSTEM = "You are a helpful assistant."

# Deep — Hermes extracts and retains structured understanding
# {repo_name} is a placeholder: fill it with SYSTEM.format(repo_name=...) before sending.
SYSTEM = """You are the institutional memory for {repo_name}.

When you receive new information:
1. Identify if a meaningful decision was made (skip noise like typo fixes)
2. Extract the rationale — the WHY, not just the what happened
3. Note any contradictions with previous decisions you remember
4. Remember the causal chain: what problem led to this decision

Store understanding, not transcripts. When asked later, cite specifics."""


The difference in answer depth between these two prompts is large enough that I'd call the system prompt the second most important design decision, after session ID granularity.

One additional tip: use temperature 0.2–0.3 for ingest (you want consistent, structured extraction) and 0.6–0.7 for Q&A (you want thoughtful synthesis, not mechanical retrieval).
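A small lookup keeps the two temperature settings from drifting apart across call sites. This is a sketch under assumptions: `CallProfile`, `PROFILES`, `request_params`, and the `"system"`/`"temperature"` keys are hypothetical names rather than Hermes parameters, and the values are simply picked from inside the ranges above.

```python
from dataclasses import dataclass

# First line of the structured prompt above, kept as a template so the
# repository name is filled in per session.
SYSTEM_TEMPLATE = "You are the institutional memory for {repo_name}."


@dataclass(frozen=True)
class CallProfile:
    temperature: float
    purpose: str


# Assumed values chosen from the ranges in the text (0.2-0.3 and 0.6-0.7).
PROFILES = {
    "ingest": CallProfile(0.2, "consistent, structured extraction"),
    "qa": CallProfile(0.7, "thoughtful synthesis"),
}


def request_params(mode: str, repo_name: str) -> dict:
    """Return the per-call settings for 'ingest' or 'qa'; unknown modes raise KeyError."""
    profile = PROFILES[mode]
    return {
        "system": SYSTEM_TEMPLATE.format(repo_name=repo_name),
        "temperature": profile.temperature,
    }
```

How these values reach the model depends on your client wrapper; the point is only that ingest and Q&A should never share a default by accident.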


4. Cron Jobs Need Explicit Session Context in the Prompt

This one cost me two hours of debugging. I registered Hermes cron jobs on startup without explicitly anchoring them to a session context in the prompt. The jobs fired on schedule but gave generic, unhelpful responses because they weren't connecting to the right accumulated memory.

# This job fires but has no memory context
await hermes.create_job(
    name="daily-analysis",
    schedule="0 2 * * *",
    prompt="Identify recurring failure patterns from recent engineering decisions.",
    # Where? Whose memory? Hermes doesn't know.
)

# This job fires AND draws from the right accumulated context
await hermes.create_job(
    name="acme-backend-daily-analysis",
    schedule="0 2 * * *",
    prompt=(
        "You are the Shadow CTO for acme/backend. "         # identity
        "Using the engineering decisions stored in your memory "  # explicit memory reference
        "from this repository, identify recurring failure patterns: "
        "components that keep breaking, decisions that were reversed, "
        "or technical debt accumulating. Be specific — cite titles and dates."
    ),
)


The "You are the X for Y" clause in the prompt is what reconnects the cron job to the right session context. Without it, you're firing a prompt into a vacuum. With it, you're triggering an agent that knows who it is and what it's been watching.
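If you register one job per repository, generating the job definition from the repository name guarantees the identity clause is never forgotten. A minimal sketch, assuming the same `name`, `schedule`, and `prompt` arguments used in the `create_job` calls above; `build_analysis_job` itself is a hypothetical helper.

```python
def build_analysis_job(owner: str, repo_name: str, schedule: str = "0 2 * * *") -> dict:
    """Build keyword arguments for a daily analysis job anchored to one repository."""
    repo = f"{owner}/{repo_name}"
    prompt = (
        f"You are the Shadow CTO for {repo}. "                    # identity clause
        "Using the engineering decisions stored in your memory "  # explicit memory reference
        "from this repository, identify recurring failure patterns: "
        "components that keep breaking, decisions that were reversed, "
        "or technical debt accumulating. Be specific: cite titles and dates."
    )
    return {
        "name": f"{owner}-{repo_name}-daily-analysis",
        "schedule": schedule,
        "prompt": prompt,
    }


job = build_analysis_job("acme", "backend")
```

The dictionary can then be unpacked into the call from the example above, as in `await hermes.create_job(**job)`.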


5. Streaming Is Worth the Extra Code

The non-streaming endpoint is significantly simpler to implement. For any user-facing feature, use streaming anyway.

Hermes answers thoughtfully. On complex questions about accumulated history, such as "what are the three biggest risks before the next release?", responses can take 10–20 seconds to generate. Without streaming, users see a spinner and wonder if something broke. With streaming, they see the answer building in real time and the wait is far less noticeable.

Backend (FastAPI SSE):

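from fastapi import APIRouter
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

router = APIRouter()

class QueryRequest(BaseModel):
    # Assumed shape, inferred from the fields the handler reads.
    question: str
    session_id: str

# `hermes` and `build_messages` are this project's own client and helper, defined elsewhere.
@router.post("/query")
async def query(body: QueryRequest):
    async def generate():
        try:
            async for chunk in hermes.stream_chat(
                messages=build_messages(body.question),
                session_id=body.session_id,
            ):
                # SSE frames are newline-delimited, so escape newlines inside a chunk.
                escaped = chunk.replace("\n", "\\n")
                yield f"data: {escaped}\n\n"
        except Exception as exc:
            yield f"data: [ERROR] {exc}\n\n"
        # Always finish with a sentinel so the client knows to stop reading.
        yield "data: [DONE]\n\n"

    # "X-Accel-Buffering: no" asks nginx not to buffer the stream.
    return StreamingResponse(generate(), media_type="text/event-stream",
                             headers={"Cache-Control": "no-cache",
                                      "X-Accel-Buffering": "no"})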


Frontend (React):

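// EventSource only issues GET requests and cannot send a body. This sketch assumes
// the route is also exposed as a GET that reads `question` and `session_id` from the
// query string; to keep the POST route, read the response with fetch() and a stream reader.
const params = new URLSearchParams({ question, session_id: sessionId });
const es = new EventSource(`/api/query?${params}`);
es.onmessage = (e) => {
    if (e.data === "[DONE]") { es.close(); return; }
    setResponse(prev => prev + e.data.replace(/\\n/g, "\n"));
};
es.onerror = () => es.close();  // otherwise the browser keeps reconnecting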


The UX difference between streaming and non-streaming is the gap between a prototype and something that feels production-ready. It's two hours of extra work and worth every minute.
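One detail worth checking in the framing above: replacing a newline with the two characters `\n` is ambiguous when the streamed text already contains a literal backslash followed by `n`, which happens as soon as the answer includes code. The sketch below shows one possible fix, escaping backslashes first so the round trip is lossless. `encode_sse` and `decode_sse` are hypothetical helpers written for this post, and `str.removeprefix` requires Python 3.9 or later.

```python
def encode_sse(chunk: str) -> str:
    """Wrap a text chunk in a single SSE data frame with reversible escaping."""
    escaped = (
        chunk.replace("\\", "\\\\")  # escape backslashes first
        .replace("\n", "\\n")        # then line feeds
        .replace("\r", "\\r")        # and carriage returns, also SSE line endings
    )
    return f"data: {escaped}\n\n"


def decode_sse(frame: str) -> str:
    """Invert encode_sse: strip the framing, then undo the escapes in one pass."""
    payload = frame.removeprefix("data: ").removesuffix("\n\n")
    unescape = {"n": "\n", "r": "\r"}
    out = []
    i = 0
    while i < len(payload):
        ch = payload[i]
        if ch == "\\" and i + 1 < len(payload):
            nxt = payload[i + 1]
            out.append(unescape.get(nxt, nxt))  # "\\\\" decodes to one backslash
            i += 2
        else:
            out.append(ch)
            i += 1
    return "".join(out)


original = 'print("done\\n")\nnext line'
assert decode_sse(encode_sse(original)) == original
```

The client has to apply the matching single-pass decode; a plain global replace of `\n` on the receiving side would reintroduce the ambiguity.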


The One-Line Summary

Hermes rewards investment in two things: session ID design and ingest prompt structure. Get those right on day one and the persistent memory largely takes care of itself — you build the domain logic, Hermes carries the institutional knowledge.

Everything else in this list is recoverable. Those two aren't.