When One AI Agent Is Not Enough: A Practical Delegation Pattern for Enterprise Systems
Amit Kayal · 2026-05-24 · via DEV Community

A lot of enterprise AI systems start the same way.

One agent.
One big prompt.
A bunch of tools.
A lot of hope.

At first, it looks great. The agent can answer questions, call a few systems, maybe even complete a useful workflow. But once the use case gets more realistic, cracks start to show.

The agent has to understand too much.
It has to access too many systems.
It has to make too many different kinds of decisions.
And when something goes wrong, it is hard to tell where the problem actually is.

That is usually the point where the issue stops being “prompt quality” and starts becoming “system design.”

One pattern I’ve found especially useful is delegation across agents and subagents.

Not because it sounds advanced.
Because it is often the more practical way to build enterprise AI.

The real problem with a single large agent

There is an appealing simplicity in saying, “Let one agent handle the whole thing.”

But enterprise workflows are rarely that clean.

Take something simple on the surface, like a customer escalation.

To handle it well, the system may need to:

  • pull ticket history
  • understand product context
  • check support policy
  • review account state
  • recommend next actions
  • trigger an internal workflow
  • draft a reply

Yes, one agent can try to do all of that.

But in practice, the more responsibilities you pile into one agent, the more fragile it becomes.

You usually end up with:

  • too much context going into one step
  • too many tools available to one component
  • weaker predictability
  • weaker governance
  • and much harder debugging

The system may still “work,” but it becomes difficult to trust.

A better pattern: one lead agent, a few focused subagents

The cleaner pattern is this:

Primary agent -> specialist subagents -> final outcome

The primary agent owns the workflow.

Its job is to understand the request, decide what needs to happen, delegate the right pieces of work, and then combine the results.

The subagents each do one thing well.

For example:

  • a retrieval subagent gets the right context
  • a policy subagent checks rules or entitlements
  • an analysis subagent recommends next steps
  • an execution subagent handles approved downstream actions
  • a communication subagent drafts the final message

That is a much healthier design than asking one broad agent to do everything in one pass.
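
Here is a minimal sketch of that shape in Python. Everything in it is hypothetical: the three subagents are plain stub functions standing in for model or tool calls, and the names (`primary_agent`, `retrieval_subagent`, and so on) are invented for illustration, not taken from any framework.

```python
# Hypothetical subagents: each stub stands in for a model or tool call.
def retrieval_subagent(request: str) -> dict:
    return {"history": [f"earlier ticket mentioning: {request}"]}

def policy_subagent(context: dict) -> dict:
    # Assumed toy rule: escalation is allowed when prior history exists.
    return {"escalation_allowed": len(context["history"]) > 0}

def communication_subagent(decision: dict) -> str:
    if decision["escalation_allowed"]:
        return "We have escalated your case to our senior support team."
    return "We are reviewing your case and will follow up shortly."

def primary_agent(request: str) -> str:
    """Owns the workflow: delegates each piece, then combines the results."""
    context = retrieval_subagent(request)
    decision = policy_subagent(context)
    return communication_subagent(decision)

reply = primary_agent("checkout page times out")
print(reply)
```

The point of the sketch is the ownership, not the stubs: only `primary_agent` knows the order of the steps, and each subagent sees just the input it needs.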

Why this pattern works better

The first reason is simple: focus.

A retrieval subagent can focus on retrieval.
A policy subagent can focus on policy.
An execution subagent can focus on action.

You are not forcing one component to juggle too many responsibilities.

The second reason is control.

Different subagents can have different permissions, different tools, and different operating boundaries. That is much easier to govern in enterprise systems.

The third reason is observability.

If the outcome is wrong, you have a better shot at knowing where it went wrong:

  • bad retrieval
  • wrong policy interpretation
  • weak action selection
  • poor response generation

That is a huge advantage once the system moves beyond demo stage.

What the primary agent should actually do

One mistake I see is treating the primary agent like a simple router.

That is not enough.

The primary agent should behave more like a coordinator.

It should:

  • understand the incoming request
  • decide what subtasks are needed
  • choose the right subagents
  • pass only the necessary context
  • review what comes back
  • and decide whether to continue, retry, escalate, or stop

In other words, it owns the workflow logic.

It should not blindly trust every subagent output.
It should have judgment.

That is what makes delegation useful rather than just decorative.
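
One way to make that judgment concrete is a small review function that the primary agent runs on every subagent result. This is a sketch under assumptions: the `SubagentResult` fields, the 0.7 confidence threshold, and the two-attempt limit are all illustrative choices, not recommendations.

```python
from dataclasses import dataclass
from enum import Enum

class NextStep(Enum):
    CONTINUE = "continue"
    RETRY = "retry"
    ESCALATE = "escalate"
    STOP = "stop"

@dataclass
class SubagentResult:
    ok: bool            # did the subagent produce a usable output?
    confidence: float   # self-reported, 0.0 to 1.0 (an assumed convention)
    attempts: int       # how many times this subtask has been tried

def review(result: SubagentResult, is_last_task: bool,
           min_confidence: float = 0.7, max_attempts: int = 2) -> NextStep:
    """Decide what the primary agent does with a subagent's output."""
    if not result.ok or result.confidence < min_confidence:
        # Weak output: try again while attempts remain, then hand to a human.
        if result.attempts < max_attempts:
            return NextStep.RETRY
        return NextStep.ESCALATE
    return NextStep.STOP if is_last_task else NextStep.CONTINUE
```

A pure router would never need a function like this. A coordinator does.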

What makes a good subagent

A good subagent is narrow.

That is probably the single most important design rule.

Each subagent should ideally have:

  • one clear job
  • limited tools
  • limited context
  • a defined output format
  • clear boundaries on what it should not do

If a subagent is doing retrieval, analysis, execution, and communication together, it is no longer a real specialist.

It is just another general-purpose agent with a different label. And once you do that, the value of delegation starts disappearing.
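
Writing the five properties down as data makes it harder for a subagent to quietly grow. The sketch below is one hypothetical way to do that. `SubagentSpec`, its field names, and the `looks_narrow` heuristic (at most three tools, at least one stated boundary) are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class SubagentSpec:
    name: str
    job: str                        # one clear job
    tools: frozenset[str]           # limited tools
    context_keys: frozenset[str]    # limited context
    output_fields: tuple[str, ...]  # a defined output format
    forbidden: tuple[str, ...]      # what it should not do

def looks_narrow(spec: SubagentSpec, max_tools: int = 3) -> bool:
    """Rough heuristic: few tools and at least one stated boundary."""
    return len(spec.tools) <= max_tools and len(spec.forbidden) > 0

policy_spec = SubagentSpec(
    name="policy",
    job="Check entitlement, SLA, and escalation rules",
    tools=frozenset({"policy_lookup"}),
    context_keys=frozenset({"account_tier", "ticket_age_hours"}),
    output_fields=("allowed", "rule_id", "confidence"),
    forbidden=("update systems", "draft customer messages"),
)
```

A check like `looks_narrow(policy_spec)` could run in CI, so a spec that picks up a fourth tool or loses its boundaries gets noticed in review.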

A sharper example

Let’s go back to the customer escalation example.

Bad design

One large agent receives the case and tries to:

  • read the issue
  • search past history
  • check policy
  • assess severity
  • decide the next action
  • update internal systems
  • draft the reply

This may work sometimes.

But it is too much responsibility in one place.

Better design

Primary agent
Owns the overall case flow.

Retrieval subagent
Gathers ticket history, account context, product details, and related documentation.

Policy subagent
Checks entitlement, SLA, escalation rules, and any support constraints.

Analysis subagent
Looks at the combined context and suggests the best next step.

Execution subagent
Triggers the approved workflow, creates tasks, or updates systems.

Communication subagent
Drafts the customer-facing or internal message.

Now the workflow is clearer.
Each step is easier to test.
And if the result is weak, you can usually tell why.
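
To show what "easier to test" means here, this is a sketch of a policy step tested on its own, with no retrieval, execution, or model call involved. The rule itself (enterprise cases open for more than 24 hours may escalate) is made up for the example.

```python
def policy_subagent(account_tier: str, hours_open: int) -> dict:
    """Hypothetical rule: enterprise cases open more than 24h may escalate."""
    allowed = account_tier == "enterprise" and hours_open > 24
    return {"escalation_allowed": allowed, "rule": "enterprise_24h"}

def test_policy_subagent() -> None:
    assert policy_subagent("enterprise", 30)["escalation_allowed"] is True
    assert policy_subagent("enterprise", 2)["escalation_allowed"] is False
    assert policy_subagent("free", 30)["escalation_allowed"] is False

test_policy_subagent()
```

With one large agent, the same check would mean running the whole case end to end and inferring the policy decision from the final reply.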

When delegation is worth it

Not every use case needs this pattern.

Sometimes one well-designed agent is enough.

Delegation becomes useful when:

  • the workflow crosses different domains
  • different systems or permissions are involved
  • some work can happen in parallel
  • one agent is becoming overloaded
  • governance starts getting messy
  • you want better testing and failure isolation

If the workflow is small and bounded, keep it simple.

The point is not to add more agents for the sake of it.
The point is to use delegation when specialization clearly improves the system.
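
One of the conditions above, work that can happen in parallel, is easy to picture in code. This sketch assumes two independent lookups with hypothetical names (`fetch_ticket_history`, `fetch_account_state`) and runs them concurrently with the standard library's `ThreadPoolExecutor`.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical lookups; in practice each would call a separate system.
def fetch_ticket_history(case_id: str) -> dict:
    return {"history": f"history for {case_id}"}

def fetch_account_state(case_id: str) -> dict:
    return {"account": f"account state for {case_id}"}

def gather_context(case_id: str) -> dict:
    """Run two independent lookups concurrently and merge their results."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        history = pool.submit(fetch_ticket_history, case_id)
        account = pool.submit(fetch_account_state, case_id)
        return {**history.result(), **account.result()}

context = gather_context("CASE-1")
```

This only pays off when the subtasks really are independent. If one lookup needs the other's output, run them in sequence.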

Practical rules that help

1. Start with a small number of subagents

Do not build a maze.

Start with one primary agent and maybe two or three specialists. That is usually enough to prove whether the pattern is helping.

2. Keep context tight

Do not pass everything to every agent.

Each subagent should get only the context it actually needs. Too much context often makes outputs worse, not better.
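
A simple way to enforce this is to declare what each subagent needs and filter before every handoff. The mapping and key names below are hypothetical.

```python
# Hypothetical mapping of each subagent to the context keys it needs.
CONTEXT_NEEDS = {
    "policy": {"account_tier", "ticket_age_hours"},
    "communication": {"customer_name", "recommended_action"},
}

def context_for(subagent: str, full_context: dict) -> dict:
    """Return only the keys this subagent is declared to need."""
    needed = CONTEXT_NEEDS[subagent]
    return {k: v for k, v in full_context.items() if k in needed}

case = {
    "customer_name": "Dana",
    "account_tier": "enterprise",
    "ticket_age_hours": 30,
    "recommended_action": "escalate",
    "raw_ticket_thread": "(long thread omitted)",
}
policy_context = context_for("policy", case)
print(policy_context)
```

The policy subagent never sees the raw ticket thread or the customer's name, because neither is in its declared needs.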

3. Use structured outputs

Subagents should return something predictable:

  • a decision
  • a label
  • a ranked list
  • a JSON object
  • a recommendation plus confidence

Not vague prose that another component has to guess at.
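
As a sketch, here is a "recommendation plus confidence" output parsed from JSON and checked before anything downstream uses it. The `Recommendation` shape and the allowed action labels are assumptions for the example.

```python
import json
from dataclasses import dataclass

@dataclass
class Recommendation:
    action: str
    confidence: float

ALLOWED_ACTIONS = {"escalate", "reply", "close"}  # hypothetical label set

def parse_recommendation(raw: str) -> Recommendation:
    """Parse a subagent's JSON output and reject anything off-contract."""
    data = json.loads(raw)  # bad JSON raises json.JSONDecodeError (a ValueError)
    action = data["action"]  # a missing key raises KeyError
    confidence = float(data["confidence"])
    if action not in ALLOWED_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError(f"confidence out of range: {confidence}")
    return Recommendation(action, confidence)

rec = parse_recommendation('{"action": "escalate", "confidence": 0.82}')
```

Failing loudly at the boundary is the design choice here. The next component either gets a valid `Recommendation` or gets nothing.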

4. Design low-confidence paths

If a subagent is not confident, that should trigger something explicit:

  • retry
  • clarification
  • fallback logic
  • human review

Do not let weak outputs quietly flow into the rest of the chain.
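
A minimal sketch of such a path, assuming each subagent returns an `(answer, confidence)` pair. The 0.7 threshold, the single retry, and the status strings are illustrative, and the clarification branch is left out to keep the example short.

```python
from typing import Callable, Optional

Result = tuple[str, float]  # (answer, confidence); an assumed convention

def run_with_low_confidence_path(
    subagent: Callable[[], Result],
    fallback: Optional[Callable[[], Result]] = None,
    threshold: float = 0.7,
    max_retries: int = 1,
) -> dict:
    """Run a subagent; never pass a low-confidence answer downstream silently."""
    for _ in range(1 + max_retries):
        answer, confidence = subagent()
        if confidence >= threshold:
            return {"status": "ok", "answer": answer}
    if fallback is not None:
        answer, confidence = fallback()
        if confidence >= threshold:
            return {"status": "fallback", "answer": answer}
    # Nothing cleared the bar: hand off explicitly instead of guessing.
    return {"status": "needs_human_review", "answer": None}
```

Every exit carries a status, so the primary agent can always tell a confident answer from a fallback or a handoff.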

5. Log the handoffs

You need to know:

  • what task was delegated
  • what context was passed
  • what came back
  • what happened next

Without that, debugging becomes painful very quickly.
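
Those four questions map directly onto one structured log record per handoff. The sketch below uses Python's standard `logging` and `json` modules. The function name and field names are hypothetical.

```python
import json
import logging
import time

logger = logging.getLogger("handoffs")

def log_handoff(task: str, subagent: str, context_keys: list[str],
                result_summary: str, next_step: str) -> dict:
    """Emit one structured record per delegation and return it."""
    record = {
        "ts": time.time(),
        "task": task,                  # what task was delegated
        "subagent": subagent,
        "context_keys": context_keys,  # what context was passed (keys only)
        "result": result_summary,      # what came back
        "next_step": next_step,        # what happened next
    }
    logger.info(json.dumps(record))
    return record
```

Logging key names rather than full values keeps records small. Whether you also log payloads depends on your own data-handling rules.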

6. Control tools by role

A retrieval subagent should not have broad execution rights.
An execution subagent should not have unnecessary access to everything.
Different responsibilities should have different permissions.

That is one of the easiest ways to keep governance strong.
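
In code, this can be as small as an allowlist checked on every tool call. The roles, tool names, and the `call_tool` helper below are all hypothetical.

```python
# Hypothetical role-to-tool allowlist; tool names are illustrative.
ROLE_TOOLS = {
    "retrieval": {"search_tickets", "read_account"},
    "execution": {"create_task", "update_ticket_status"},
}

class ToolNotPermitted(PermissionError):
    """Raised when a role tries to call a tool outside its allowlist."""

def call_tool(role: str, tool: str, registry: dict, **kwargs):
    """Invoke a tool only if the caller's role is allowed to use it."""
    if tool not in ROLE_TOOLS.get(role, set()):
        raise ToolNotPermitted(f"{role!r} may not call {tool!r}")
    return registry[tool](**kwargs)

# Stub tools standing in for real integrations.
registry = {
    "search_tickets": lambda query: [f"ticket matching {query}"],
    "create_task": lambda title: {"created": title},
}
```

Here `call_tool("retrieval", "search_tickets", registry, query="timeout")` goes through, while `call_tool("retrieval", "create_task", registry, title="x")` raises `ToolNotPermitted`. The check lives in the calling layer, not in the prompt.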

Common mistakes

A few patterns show up again and again.

Too many agents too early
More moving parts do not automatically make the design better.

Subagents with overlapping jobs
If roles are fuzzy, delegation becomes noisy.

Passing all context everywhere
That weakens specialization fast.

No fallback design
One failed subtask should not silently break the whole workflow.

None of these are prompt problems. This is an architecture pattern, so the fixes are design fixes.

Final thought

Delegation across agents and subagents is one of the more practical patterns in enterprise AI.

Not because it is clever.
Because it reflects how real systems usually need to operate.

The strongest setups are usually not the ones with the most agents.

They are the ones where:

  • the primary agent clearly owns the workflow
  • the subagents are genuinely specialized
  • the context is controlled
  • the outputs are structured
  • and the operating model is easy to debug and govern

That is what turns a multi-agent design from an interesting idea into something you can actually run in production.