How to Build a Supervisor Agent Architecture Without Frameworks
Rafael Tedes · 2026-05-23 · via DEV Community

A few days ago, I wrote about building an agentic pipeline from scratch in pure Python. The idea was intentionally simple: receive a task, invoke tools, and generate a response.

That architecture works surprisingly well for linear workflows. But real-world AI systems become complicated much faster than most tutorials suggest.

Eventually, one agent is no longer enough. A single reasoning loop starts accumulating too many responsibilities. The prompt grows uncontrollably, tool definitions pile up, execution logic becomes tangled, and debugging turns into a nightmare. What started as a clean “AI agent” slowly becomes a monolith trying to do everything at once.

This is where the Supervisor Pattern becomes useful.

Instead of relying on one giant agent, we introduce a central orchestrator responsible for coordinating specialized executors. Some executors may be tools, others may be reusable workflows, and others may be autonomous agents focused on a specific domain.

Conceptually, this is closer to how production AI systems are built. Products such as GitHub Copilot and Claude, along with many enterprise AI platforms, are not simply “one prompt talking to one model.” Much of their engineering complexity comes from orchestration: deciding what should execute, when it should execute, and how the results should be combined.

In this article, we will build a simplified supervisor-based multi-agent architecture in pure Python, without relying on orchestration frameworks.


From Single Agents to Execution Runtimes

Most beginner agent tutorials follow roughly the same structure. A user sends a task, the agent reasons about it, invokes tools when necessary, and eventually produces an answer.

At small scale, this works well enough.

But imagine a more realistic request:

Research vector databases,
generate implementation examples,
analyze project files,
and review the generated solution.

A single agent now has to:

  • reason about research,
  • inspect files,
  • generate code,
  • review output,
  • manage tools,
  • maintain context,
  • and orchestrate execution order.

The problem is not necessarily model capability. The problem is architecture.

At some point, the “agent” stops being just a reasoning unit and accidentally becomes a workflow engine.

The Supervisor Pattern solves this by separating orchestration from execution.

Instead of one overloaded agent, we create specialized executors coordinated by a supervisor.

The architecture looks like this:

                    Supervisor
                         |
        +----------------+----------------+
        |                |                |
     Tools            Skills           Agents
        |                |                |
   File Search      Ticket Workflow   Coding Agent
   Web Search       Log Analysis      Research Agent

The important idea here is that the supervisor does not care whether it is invoking a tool, a workflow, or another agent. Everything follows the same execution contract.


Creating a Common Execution Interface

One of the simplest but most important architectural decisions is standardizing how execution works.

Instead of creating separate orchestration logic for tools, agents, and workflows, we can define a shared interface:

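from abc import ABC, abstractmethod


class Executable(ABC):
    """Common contract for tools, workflows, and agents."""

    @abstractmethod
    async def execute(self, task: str, context: dict) -> str:
        """Run the task and return a text result."""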

This abstraction becomes surprisingly powerful.

A search tool can implement it. A coding agent can implement it. A reusable workflow can implement it. To the supervisor, everything becomes just another executable component.

That greatly simplifies orchestration.
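
To make that concrete, here is a minimal sketch of an adapter that lets a plain async function satisfy the same contract. `FunctionExecutable`, `echo`, and `run_echo` are hypothetical names introduced for illustration, and the `Executable` class is repeated only so the snippet runs on its own.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Awaitable, Callable


class Executable(ABC):
    """Same contract as above: one async entry point."""

    @abstractmethod
    async def execute(self, task: str, context: dict) -> str:
        ...


class FunctionExecutable(Executable):
    """Hypothetical adapter: wraps an async function as an Executable."""

    def __init__(self, fn: Callable[[str, dict], Awaitable[str]]):
        self._fn = fn

    async def execute(self, task: str, context: dict) -> str:
        return await self._fn(task, context)


async def echo(task: str, context: dict) -> str:
    # Stand-in for real work such as an HTTP call or a file scan.
    return f"echo: {task}"


def run_echo(task: str) -> str:
    return asyncio.run(FunctionExecutable(echo).execute(task, {}))
```

With an adapter like this, small capabilities would not each need their own class, while the supervisor still sees a single `execute` method.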


Building Specialized Executors

Now let’s create a few executors with different responsibilities.

We will start with a simple search tool:

class SearchTool(Executable):
    """Mock tool: returns a placeholder instead of calling a search backend."""

    async def execute(self, task, context):
        return f"Searching documentation for: {task}"

Then a file analysis tool:

class FileAnalysisTool(Executable):
    """Mock tool: returns a placeholder instead of reading any files."""

    async def execute(self, task, context):
        return f"Analyzing project files for: {task}"

These tools are intentionally small and focused. They represent atomic capabilities.

Now we can create specialized agents.

A research agent:

class ResearchAgent(Executable):
    """Mock agent: returns a placeholder instead of calling a model."""

    async def execute(self, task, context):
        return f"Research findings for: {task}"

A coding agent:

class CodingAgent(Executable):
    """Mock agent: returns a placeholder instead of generating code."""

    async def execute(self, task, context):
        return f"Generated implementation for: {task}"

And finally, a review agent:

class ReviewAgent(Executable):
    """Mock agent: returns a placeholder instead of reviewing code."""

    async def execute(self, task, context):
        return f"Code review completed for: {task}"

This separation of concerns is one of the biggest advantages of supervisor architectures. Each executor remains isolated and focused, which makes the overall system easier to scale and debug.


Dynamic Registration

Hardcoding dependencies quickly becomes painful as the number of executors grows.

Instead, we can create a registry capable of dynamically storing and discovering executors at runtime.

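class Registry:
    """Maps executor names to their metadata and instances."""

    def __init__(self):
        self.executables = {}

    def register(self, name, description, executable_type, instance):
        # Registering an existing name replaces the previous entry.
        self.executables[name] = {
            "description": description,
            "type": executable_type,
            "instance": instance,
        }

    def get(self, name):
        # Raises KeyError if the name was never registered.
        return self.executables[name]

    def list(self):
        # Returns the underlying dict: name -> entry.
        return self.executables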

Now we can register everything dynamically:

registry = Registry()

registry.register(
    "search_tool",
    "Searches technical documentation",
    "tool",
    SearchTool()
)

registry.register(
    "file_analysis_tool",
    "Analyzes project files",
    "tool",
    FileAnalysisTool()
)

registry.register(
    "research_agent",
    "Performs research tasks",
    "agent",
    ResearchAgent()
)

registry.register(
    "coding_agent",
    "Generates code",
    "agent",
    CodingAgent()
)

registry.register(
    "review_agent",
    "Reviews implementations",
    "agent",
    ReviewAgent()
)

At this point, the system starts feeling much more like a runtime instead of a simple chatbot wrapper.
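
Runtime discovery can be as small as filtering that dictionary. The sketch below assumes a hypothetical `catalog` helper (not part of the registry above) that renders registered entries as text lines, optionally restricted to one type. The `Registry` here is a trimmed copy so the example is self-contained.

```python
class Registry:
    """Trimmed copy of the registry above, enough for the example."""

    def __init__(self):
        self.executables = {}

    def register(self, name, description, executable_type, instance):
        self.executables[name] = {
            "description": description,
            "type": executable_type,
            "instance": instance,
        }


def catalog(registry, executable_type=None):
    """Return 'name (type): description' lines, sorted by name.

    Hypothetical helper: pass executable_type to keep only one kind.
    """
    return [
        f"{name} ({entry['type']}): {entry['description']}"
        for name, entry in sorted(registry.executables.items())
        if executable_type is None or entry["type"] == executable_type
    ]
```

A listing like this is one way to expose the available executors to whatever component does the planning, without that component importing any executor class directly.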


Building the Supervisor

The supervisor is the heart of the architecture.

Its responsibility is not necessarily to solve the task directly, but rather to decide which executors should participate in solving it.

We will start with a very simple planner:

class Supervisor:

    def __init__(self, registry):

        self.registry = registry

    async def plan(self, task):

        selected = []

        lower_task = task.lower()

        if "research" in lower_task:
            selected.append("research_agent")

        if "implement" in lower_task:
            selected.append("coding_agent")

        if "review" in lower_task:
            selected.append("review_agent")

        if "search" in lower_task:
            selected.append("search_tool")

        return selected

This planner is intentionally primitive. In production systems, this planning phase is often delegated to an LLM that returns structured execution plans.

But even this simplified version already demonstrates the architecture.

The supervisor receives a goal and decides dynamically which executors should participate.
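
One caveat with plain substring checks: "search" is also a substring of "research", so a task that only mentions research would still select search_tool. A minimal sketch of a stricter matcher follows, assuming a hypothetical `ROUTES` table and `plan_by_word` function; it anchors each keyword to the start of a word, so "implementation" still matches "implement".

```python
import re

# Hypothetical routing table: keyword prefix -> executor name.
ROUTES = {
    "research": "research_agent",
    "implement": "coding_agent",
    "review": "review_agent",
    "search": "search_tool",
}


def plan_by_word(task: str) -> list[str]:
    """Select executors whose keyword starts a word in the task."""
    lower_task = task.lower()
    selected = []
    for keyword, name in ROUTES.items():
        # \b anchors the keyword to the start of a word, so "search"
        # no longer matches inside "research".
        if re.search(rf"\b{keyword}", lower_task):
            selected.append(name)
    return selected
```

Keeping the rules in a table also means new executors can be routed without adding another `if` branch.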


Parallel Execution

Now comes the most interesting part.

Instead of executing everything sequentially, the supervisor can orchestrate independent tasks concurrently.

import asyncio

class Supervisor:

    def __init__(self, registry):

        self.registry = registry

    async def plan(self, task):

        selected = []

        lower_task = task.lower()

        if "research" in lower_task:
            selected.append("research_agent")

        if "implement" in lower_task:
            selected.append("coding_agent")

        if "review" in lower_task:
            selected.append("review_agent")

        if "search" in lower_task:
            selected.append("search_tool")

        return selected

    async def execute(self, task, context):

        selected = await self.plan(task)

        executions = []

        for name in selected:

            executable = self.registry.get(name)["instance"]

            executions.append(
                executable.execute(task, context)
            )

        results = await asyncio.gather(*executions)

        return results

This changes the nature of the system completely.

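We are no longer building a linear tool-calling loop. We are building an orchestration runtime that coordinates concurrent execution: asyncio.gather schedules every selected executor on the same event loop and returns their results in the order they were submitted.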

That distinction matters a lot.
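
Running every selected executor at once is only safe when the steps are independent. A review step, for example, usually needs the code it is reviewing. One possible refinement, sketched here under assumptions, groups executors into stages: stages run in order, executors inside a stage run concurrently, and results travel forward through the context. `STAGES`, `run_stages`, and the three step functions are hypothetical.

```python
import asyncio


async def research(task: str, context: dict) -> str:
    return f"notes on {task}"


async def code(task: str, context: dict) -> str:
    return f"code for {task}"


async def review(task: str, context: dict) -> str:
    # Reads what the earlier stage produced instead of guessing.
    return f"reviewed: {context['results']['code']}"


# Hypothetical plan: a list of stages; each stage maps a name to a step.
STAGES = [
    {"research": research, "code": code},  # independent, run together
    {"review": review},                    # needs the output of "code"
]


async def run_stages(task: str) -> dict:
    context = {"results": {}}
    for stage in STAGES:
        names = list(stage)
        outputs = await asyncio.gather(
            *(stage[name](task, context) for name in names)
        )
        # gather preserves input order, so zip pairs names with outputs.
        context["results"].update(zip(names, outputs))
    return context["results"]
```

This keeps the concurrency benefit for independent work while giving dependent steps a defined place to read their inputs.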


Running the System

Now we can execute the entire pipeline:

async def main():

    supervisor = Supervisor(registry)

    result = await supervisor.execute(
        """
        Research vector databases,
        search implementation examples,
        implement a prototype,
        and review the generated code
        """,
        {}
    )

    for item in result:
        print(item)

if __name__ == "__main__":
    asyncio.run(main())

The interesting part is not the mock outputs themselves. The interesting part is the orchestration model emerging underneath.

The supervisor analyzes the task, dynamically selects executors, runs them concurrently, and gathers their results into a single list.

That is much closer to how modern AI systems actually operate internally.


The Architectural Shift

At this point, the system has evolved far beyond a simple “AI chatbot.”

The supervisor is acting simultaneously as:

  • planner,
  • router,
  • scheduler,
  • orchestrator.

This is also why many production AI systems are significantly more complicated than “send prompt, receive response.”

A large portion of the engineering complexity comes from:

  • orchestration,
  • execution management,
  • concurrency,
  • state propagation,
  • retries,
  • failure isolation,
  • observability.

The model itself is only one piece of the system.
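
Two items from that list, retries and failure isolation, can be sketched with the standard library alone. The example assumes a hypothetical `with_retries` wrapper and two mock executors; `return_exceptions=True` is a real `asyncio.gather` option that returns exceptions as results instead of raising the first one.

```python
import asyncio


async def with_retries(make_call, attempts: int = 3, delay: float = 0.0):
    """Await make_call() up to `attempts` times, re-raising the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except Exception:
            if attempt == attempts:
                raise
            await asyncio.sleep(delay * attempt)  # simple linear backoff


class FlakyTool:
    """Hypothetical executor that fails on its first call only."""

    def __init__(self):
        self.calls = 0

    async def execute(self, task: str, context: dict) -> str:
        self.calls += 1
        if self.calls == 1:
            raise ConnectionError("transient failure")
        return f"ok: {task}"


class BrokenTool:
    """Hypothetical executor that always fails."""

    async def execute(self, task: str, context: dict) -> str:
        raise RuntimeError("permanent failure")


async def run_isolated(task: str) -> list:
    tools = [FlakyTool(), BrokenTool()]
    # return_exceptions=True keeps one failure from discarding the
    # results of the executors that succeeded.
    return await asyncio.gather(
        *(with_retries(lambda t=t: t.execute(task, {})) for t in tools),
        return_exceptions=True,
    )
```

The caller then receives a mix of results and exception objects and can decide per executor whether to report, skip, or escalate.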


Moving Toward LLM-Based Planning

Our planner currently uses simple rule matching:

if "review" in lower_task:
    selected.append("review_agent")

But modern systems usually replace this with an LLM planner capable of generating structured execution plans.

Something like:

prompt = f"""
Task:
{task}

Available executors:
- research_agent
- coding_agent
- review_agent
- search_tool

Return a JSON object with an "executors" list and a "parallel" flag.
"""

The LLM might return:

{
  "executors": [
    "research_agent",
    "coding_agent",
    "review_agent"
  ],
  "parallel": true
}

At that point, the runtime becomes significantly more autonomous.

The supervisor is no longer following hardcoded execution paths. It is dynamically constructing execution graphs at runtime.
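
Because the planner's reply is untrusted text, a runtime would normally validate it before acting on it. A minimal sketch, assuming the reply is shaped like the JSON above; `parse_plan` and `KNOWN_EXECUTORS` are hypothetical names, and in a fuller system the allowed names would come from the registry.

```python
import json

# Hypothetical allow-list; mirrors the executors named in the prompt.
KNOWN_EXECUTORS = {
    "research_agent",
    "coding_agent",
    "review_agent",
    "search_tool",
}


def parse_plan(raw: str) -> dict:
    """Validate a planner reply and return a normalized plan."""
    plan = json.loads(raw)  # raises json.JSONDecodeError on invalid JSON
    if not isinstance(plan, dict):
        raise ValueError("plan must be a JSON object")
    executors = plan.get("executors")
    if not isinstance(executors, list) or not executors:
        raise ValueError("plan must list at least one executor")
    unknown = [name for name in executors if name not in KNOWN_EXECUTORS]
    if unknown:
        raise ValueError(f"unknown executors: {unknown}")
    # Default to sequential execution when the flag is missing.
    return {
        "executors": executors,
        "parallel": bool(plan.get("parallel", False)),
    }
```

Rejecting unknown names at this boundary means a hallucinated executor fails with a clear error instead of a `KeyError` deep inside the supervisor.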


Production Realities

This is where things become genuinely difficult.

Once agents can invoke tools, workflows, and even other agents, the runtime itself becomes the primary engineering challenge.

You suddenly need to think about:

  • recursion protection,
  • concurrency limits,
  • cancellation,
  • retries,
  • structured outputs,
  • tracing,
  • execution graphs,
  • timeout management.
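
Two of these concerns, concurrency limits and timeouts, map directly onto asyncio primitives. The sketch below assumes a hypothetical `run_limited` helper that takes zero-argument coroutine functions; `asyncio.Semaphore` caps how many run at once and `asyncio.wait_for` cancels a job that exceeds its budget. The timeout here starts counting only once a job has acquired a slot.

```python
import asyncio


async def run_limited(jobs, max_concurrent: int = 2, timeout: float = 1.0):
    """Run coroutine factories with a concurrency cap and per-job timeout."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job):
        async with semaphore:
            try:
                return await asyncio.wait_for(job(), timeout)
            except asyncio.TimeoutError:
                # Placeholder result; a real runtime might raise or log.
                return "timed out"

    return await asyncio.gather(*(guarded(job) for job in jobs))


async def fast():
    return "fast done"


async def slow():
    await asyncio.sleep(10)  # cancelled by wait_for after the timeout
    return "slow done"
```

Calling `asyncio.run(run_limited([fast, slow, fast], timeout=0.05))` would let at most two jobs run at a time and replace the slow job's result with the placeholder.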

For example, agents invoking agents can accidentally create infinite loops:

Supervisor -> ResearchAgent
ResearchAgent -> Supervisor
Supervisor -> ResearchAgent

You quickly realize that the difficult part of AI systems is often not model invocation.

The difficult part is building reliable orchestration around the model.
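
As one example of that orchestration work, the loop shown above can be cut off with a depth counter carried in the context. This is a minimal sketch under assumptions: `invoke`, `MAX_DEPTH`, and `DepthExceeded` are hypothetical, and the two functions stand in for a supervisor and an agent that delegate to each other.

```python
import asyncio

MAX_DEPTH = 3  # hypothetical limit; tune per system


class DepthExceeded(RuntimeError):
    pass


async def invoke(executor, task: str, context: dict):
    """Call an executor with a depth counter carried in the context."""
    depth = context.get("depth", 0)
    if depth >= MAX_DEPTH:
        raise DepthExceeded(f"stopped at depth {depth}")
    # Pass a copy so sibling calls do not share the counter.
    return await executor(task, {**context, "depth": depth + 1})


async def supervisor(task: str, context: dict):
    return await invoke(research_agent, task, context)


async def research_agent(task: str, context: dict):
    # Delegates back to the supervisor, recreating the loop.
    return await invoke(supervisor, task, context)


def run_loop() -> str:
    try:
        asyncio.run(invoke(supervisor, "research", {}))
    except DepthExceeded as error:
        return str(error)
    return "finished"
```

Routing every nested call through one function like `invoke` gives the runtime a single place to enforce the limit, and the same hook could later carry tracing or cancellation.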


Final Thoughts

Once you understand the Supervisor Pattern, you stop thinking about AI agents as isolated chatbots.

You start thinking in terms of execution runtimes, orchestration graphs, distributed reasoning, and autonomous workflows.

That shift in perspective changes everything.

And interestingly, none of this requires a framework.

Underneath most orchestration libraries, the core execution model is still surprisingly simple:

await executable.execute(task, context)

Everything else is architecture layered on top.