We Didn’t Want Another AI Wrapper — So We Explored a High-Speed Hermes Orchestrator for Engineering Crews
Apurba Singh · 2026-05-26 · via DEV Community

This is a submission for the Hermes Agent Challenge

Our goal was not to build another AI wrapper, but to explore how Hermes Agent behaves as a persistent orchestration layer coordinating specialized autonomous workers inside real engineering governance workflows.


Most AI systems today are still fundamentally single-threaded assistants wrapped inside nicer interfaces.
You type a prompt, the model responds, and the workflow ends there.

But our problem was different.

Over the last few years we worked closely with alumni groups, business operators, SaaS platforms, and community engineering teams. One recurring issue appeared everywhere:

People did not simply want AI-generated text.
They wanted workflow intelligence.

They wanted systems capable of:

  • coordinating technical tasks,
  • evaluating operational risks,
  • planning execution flows,
  • synthesizing structured engineering decisions,
  • and operating reliably across multiple autonomous workers.

That realization eventually led us toward Hermes Agent.

Not because we wanted another chatbot.

But because we wanted to explore orchestration.


The Core Idea

We started asking ourselves a simple question:

What happens when Hermes stops behaving like a conversational assistant and starts behaving like a managerial orchestration layer?

That question became the foundation of our experiment.

The result was Gotihub Hermes Crew.

The name itself carries the philosophy behind the project.

Gotihub is derived from the Bengali word Goti (গতি), meaning Speed.

We wanted to explore whether autonomous engineering workers could coordinate quickly, reliably, and structurally inside real governance workflows.

The result became a high-speed multi-agent engineering orchestration system capable of analyzing GitHub repositories through specialized autonomous workers coordinated by Hermes.


Project Links

Live Demo

https://crew.gotihub.com

GitHub Repository

https://github.com/apurba-labs/gotihub-hermes-crew


Why We Didn’t Want a Single Monolithic Agent

One massive prompt window handling:

  • security analysis,
  • architecture auditing,
  • roadmap planning,
  • and executive synthesis

quickly becomes expensive, unstable, and difficult to govern.

So instead of forcing one model to think about everything simultaneously, we separated:

Execution from Governance

Execution Layer

Specialized Gemma workers execute focused engineering tasks independently.

Governance Layer

Hermes coordinates, synthesizes, and manages the outputs generated by those workers.

That separation became the most important architectural decision in the project.


The Multi-Agent Architecture

Our orchestration pipeline runs four agents across three stages:

  1. SecurityAgent performs repository security analysis.
  2. ArchitectureAgent evaluates structural and maintainability health.
  3. PlanningAgent generates engineering roadmap recommendations.
  4. Hermes Master synthesizes everything into a structured managerial report.

The important detail is that the first two agents execute concurrently as Stage 1; planning and synthesis then follow as Stages 2 and 3.

We intentionally used Python’s native asynchronous execution model instead of sequential blocking pipelines.


Stage 1 Concurrency with asyncio.gather

The first orchestration layer launches multiple specialized workers simultaneously:

  • SecurityAgent
  • ArchitectureAgent

Both execute inside an asyncio.gather() orchestration block.

This allowed us to explore:

  • concurrent repository analysis,
  • isolated engineering responsibilities,
  • and structured task specialization.

Instead of treating AI as a single giant context window, we treated it like a coordinated engineering crew.
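
A minimal sketch of what such a Stage 1 block could look like. The worker functions, their return shape, and the `run_stage_one` name are hypothetical stand-ins, not the project's actual code; a short sleep replaces the real inference call.

```python
import asyncio


async def run_security_agent(files: dict[str, str]) -> dict:
    # Hypothetical worker: a real one would send the files to the model here.
    await asyncio.sleep(0.01)  # stands in for inference latency
    return {"agent": "SecurityAgent", "files_seen": len(files)}


async def run_architecture_agent(files: dict[str, str]) -> dict:
    # Hypothetical worker with the same assumed return shape.
    await asyncio.sleep(0.01)
    return {"agent": "ArchitectureAgent", "files_seen": len(files)}


async def run_stage_one(files: dict[str, str]) -> list[dict]:
    # gather() runs both coroutines concurrently and returns their results
    # in the order the awaitables were passed in.
    # return_exceptions=True hands back a failing worker's exception as a
    # result instead of raising it, so the other report is still collected.
    results = await asyncio.gather(
        run_security_agent(files),
        run_architecture_agent(files),
        return_exceptions=True,
    )
    reports = []
    for result in results:
        if isinstance(result, Exception):
            reports.append({"agent": "unknown", "error": str(result)})
        else:
            reports.append(result)
    return reports


stage_one_reports = asyncio.run(run_stage_one({"app/main.py": "print('hi')"}))
```

Whether this shortens wall-clock time in practice depends on whether the inference backend can actually serve two requests in parallel; `asyncio.gather()` only guarantees that neither worker waits for the other to be started.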


System Workflow Architecture

Here is the orchestration workflow powering the system:

[Figure: Hermes Workflow]

The workflow is intentionally separated into:

  • concurrent execution,
  • planning synthesis,
  • and executive orchestration.

This structure allowed us to keep responsibilities isolated while still producing a consolidated engineering report.


Hermes as the Orchestrator

This is where Hermes became genuinely interesting.

Hermes does not directly parse raw repositories in our architecture.

Instead, Hermes behaves like a managerial synthesis layer.

The worker agents generate:

  • summaries,
  • issue reports,
  • confidence scores,
  • engineering recommendations.

Hermes then:

  • resolves overlap,
  • synthesizes cross-agent conclusions,
  • generates executive summaries,
  • and produces structured JSON outputs.

In other words:

The workers execute.
Hermes governs.

That orchestration philosophy changed how we approached agent systems entirely.
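
To make "resolving overlap" concrete, here is a small deterministic sketch of how worker reports might be merged before they reach the governance layer. The `WorkerReport` schema, its field names, and `merge_reports` are assumptions made for illustration; in the project itself this synthesis is performed by Hermes, not by a hand-written merge function.

```python
import json
from dataclasses import dataclass, field


@dataclass
class WorkerReport:
    # Hypothetical schema; the field names are assumptions, not the project's own.
    agent: str
    summary: str
    confidence: float
    issues: list[str] = field(default_factory=list)
    recommendations: list[str] = field(default_factory=list)


def merge_reports(reports: list[WorkerReport]) -> str:
    # Deduplicate issues by a case- and whitespace-insensitive key,
    # remembering which agents raised each one.
    merged: dict[str, dict] = {}
    for report in reports:
        for issue in report.issues:
            key = " ".join(issue.lower().split())
            entry = merged.setdefault(
                key, {"issue": issue, "raised_by": [], "confidence": 0.0}
            )
            entry["raised_by"].append(report.agent)
            # Keep the highest confidence reported for this issue.
            entry["confidence"] = max(entry["confidence"], report.confidence)
    payload = {
        "summaries": {r.agent: r.summary for r in reports},
        "issues": sorted(
            merged.values(), key=lambda e: e["confidence"], reverse=True
        ),
    }
    return json.dumps(payload, indent=2)


example_reports = [
    WorkerReport("SecurityAgent", "Secrets in config.", 0.9, ["Hardcoded API key"]),
    WorkerReport(
        "ArchitectureAgent",
        "Config is tangled.",
        0.6,
        ["hardcoded  api key", "God module"],
    ),
]
synthesis_input = json.loads(merge_reports(example_reports))
```

In this example both workers flag the same hardcoded key, so the merged payload lists it once with both agents attached. A pre-merge like this would also shrink the payload handed to the synthesis step.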


Multi-Subdomain Infrastructure Design

As the system evolved, we realized orchestration architecture alone was not enough.

We also needed infrastructure separation.

So we deployed the ecosystem using multiple subdomains and isolated routing layers:

  • gotihub.com → corporate site
  • agl.gotihub.com → SaaS engine
  • crew.gotihub.com → Hermes orchestration platform

Behind the scenes:

  • FastAPI handled orchestration,
  • Docker managed runtime isolation,
  • Nginx routed ingress traffic,
  • Ollama powered local inference,
  • and Hermes coordinated the synthesis layer.

Most importantly:

The inference backbone was never exposed directly to the public internet.


Internal AI Backbone Architecture

The deployment topology evolved into something closer to a lightweight orchestration mesh:

[Figure: Infrastructure Architecture]

This allowed multiple services to share:

  • one centralized inference core,
  • isolated application routing,
  • and internal-only AI communication.

Real Engineering Problems We Hit

This project was not smooth.

And honestly, that’s where most of the learning happened.


The Local Compute Bottleneck

Our earliest orchestration runs were extremely slow.

One real telemetry session looked like this:

[TELEMETRY] GitHubLoader fetched 8 files in 5.91 seconds.

[Orchestrator] Starting Full Pipeline...
[TELEMETRY] Stage 1 took 218.68 seconds.
[TELEMETRY] Stage 2 took 72.19 seconds.
[TELEMETRY] Stage 3 took 120.18 seconds.

[TELEMETRY] Pipeline Complete! Total Runtime: 411.05 seconds.

The bottleneck was not orchestration.

It was:

  • oversized repository context,
  • local inference latency,
  • verbose prompt chains,
  • and massive token generation overhead.

That distinction mattered, because it meant the orchestration design itself was not the limiting factor. The inference strategy was what needed optimization.
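
Telemetry lines in the format quoted above can be produced with a small timing helper, and the logged figures can be turned into a per-stage breakdown. This is a sketch only: `timed` and `stage_shares` are hypothetical names, and the sleep stands in for real agent work.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed(label: str, timings: dict[str, float]):
    # perf_counter() is a monotonic clock intended for measuring intervals.
    start = time.perf_counter()
    try:
        yield
    finally:
        # Record the duration even if the stage raised an exception.
        timings[label] = time.perf_counter() - start
        print(f"[TELEMETRY] {label} took {timings[label]:.2f} seconds.")


def stage_shares(timings: dict[str, float]) -> dict[str, float]:
    # Percentage of total runtime spent in each stage, rounded to one decimal.
    total = sum(timings.values())
    return {
        label: round(100 * seconds / total, 1)
        for label, seconds in timings.items()
    }


# Timing a placeholder stage.
demo_timings: dict[str, float] = {}
with timed("Stage 1", demo_timings):
    time.sleep(0.01)

# Applying the breakdown to the figures from the session quoted above.
logged = {"Stage 1": 218.68, "Stage 2": 72.19, "Stage 3": 120.18}
print(stage_shares(logged))
```

For the logged session the three stages sum to the reported 411.05 seconds, and the breakdown comes out at roughly 53% for Stage 1, 18% for Stage 2, and 29% for Stage 3, which is why the concurrent worker stage is the first place to look for savings.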


What We Optimized

We eventually began improving runtime by:

  • reducing repository context size,
  • prioritizing critical engineering files,
  • limiting unnecessary token generation,
  • shrinking synthesis payloads,
  • and improving async orchestration boundaries.

The system became dramatically more stable once we stopped treating every file equally.
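
One way to stop treating every file equally is to rank files and fill a fixed context budget in priority order. The sketch below assumes a character budget as a rough proxy for tokens; the priority tables and the `select_context` function are illustrative guesses, not the rules the project uses.

```python
from pathlib import PurePosixPath

# Hypothetical priority rules: lower numbers are sent to the model first.
PRIORITY_NAMES = {
    "dockerfile": 0,
    "requirements.txt": 0,
    "package.json": 0,
    "pyproject.toml": 0,
}
PRIORITY_SUFFIXES = {".py": 1, ".ts": 1, ".js": 1, ".yml": 2, ".yaml": 2, ".md": 3}


def file_priority(path: str) -> int:
    p = PurePosixPath(path)
    name = p.name.lower()
    if name in PRIORITY_NAMES:
        return PRIORITY_NAMES[name]
    # Unknown file types get the lowest priority.
    return PRIORITY_SUFFIXES.get(p.suffix.lower(), 4)


def select_context(files: dict[str, str], max_chars: int) -> dict[str, str]:
    # Fill the budget in priority order; within one priority level,
    # shorter files go first so more distinct files fit.
    selected: dict[str, str] = {}
    remaining = max_chars
    ordered = sorted(files, key=lambda p: (file_priority(p), len(files[p]), p))
    for path in ordered:
        if remaining <= 0:
            break
        content = files[path][:remaining]  # truncate the last file to fit
        selected[path] = content
        remaining -= len(content)
    return selected


repo_files = {
    "README.md": "x" * 100,
    "app/main.py": "y" * 50,
    "Dockerfile": "z" * 30,
}
context = select_context(repo_files, max_chars=60)
```

With a 60-character budget the Dockerfile is included whole, `app/main.py` is truncated to the remaining 30 characters, and the README is dropped.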


Defensive Failure Engineering

One of the most important lessons came from structured output failures.

Large orchestration chains occasionally returned:

  • malformed JSON,
  • partial synthesis blocks,
  • or incomplete manager responses.

Instead of allowing pipeline collapse, we added:

  • fallback execution paths,
  • JSON cleanup layers,
  • defensive parsing,
  • and structured failure recovery.

That forced us to think less like prompt engineers and more like systems engineers.
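
A defensive parsing layer of this kind can be sketched with the standard library alone. The function name, the fallback report shape, and the specific cleanup steps below are assumptions for illustration; the project's actual recovery logic may differ.

```python
import json
import re

FENCE = "`" * 3  # Markdown code fence that models often wrap JSON in

# Hypothetical fallback shape returned when nothing parseable is found.
FALLBACK_REPORT = {"status": "degraded", "summary": "", "issues": []}


def parse_model_json(raw: str) -> dict:
    # 1. Drop Markdown fences around the payload.
    text = raw.replace(FENCE + "json", "").replace(FENCE, "").strip()
    # 2. Keep only the span from the first "{" to the last "}".
    start, end = text.find("{"), text.rfind("}")
    if start != -1 and end > start:
        candidate = text[start : end + 1]
        # 3. Try the text as-is first, then with trailing commas removed.
        #    The comma cleanup is naive and could alter a string value that
        #    happens to contain ",}" or ",]", so it is only a second attempt.
        without_trailing_commas = re.sub(r",\s*([}\]])", r"\1", candidate)
        for attempt in (candidate, without_trailing_commas):
            try:
                parsed = json.loads(attempt)
            except json.JSONDecodeError:
                continue
            if isinstance(parsed, dict):
                return parsed
    # 4. Structured failure: the pipeline receives a valid dict either way.
    return {**FALLBACK_REPORT, "raw_excerpt": raw[:200]}
```

The design choice here is that downstream stages never see an exception or a raw string: they always receive a dictionary, and a `"status": "degraded"` marker tells them the manager response could not be recovered.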


Why Hermes Actually Worked Well

Frameworks like CrewAI are excellent for rapidly assembling conversational agent pipelines.

But our exploration focused on something slightly different:

  • persistent orchestration,
  • structured engineering outputs,
  • governance-oriented workflows,
  • and isolated worker responsibilities.

We wanted Hermes to operate less like a conversational assistant and more like an engineering coordination layer.

That distinction became the entire philosophy behind the project.


What Fascinated Us Most

The most interesting part was not whether AI could generate text.

It was whether autonomous workers could coordinate reliably inside real operational systems.

That changes the conversation entirely.

Instead of asking:

“Can AI answer questions?”

We started asking:

“Can AI workers collaborate responsibly inside engineering governance workflows?”

Hermes gave us a practical way to explore that future.

And honestly, that exploration became far more valuable than simply building another AI wrapper.


Built With

  • Hermes Agent
  • FastAPI
  • Python AsyncIO
  • Ollama
  • Gemma 3
  • Docker
  • Nginx
  • SQLite
  • Next.js

Final Thoughts

This project is still evolving.

We are actively optimizing:

  • orchestration runtime,
  • inference efficiency,
  • streaming telemetry,
  • structured synthesis,
  • and governance reliability.

But the biggest thing we learned was this:

Autonomous systems become genuinely interesting when they stop behaving like isolated chatbots and start behaving like coordinated engineering workers.

That is the future we wanted to explore with Hermes.

And we are excited to continue building toward it.