
Claude Code Agents & Subagents: What They Actually Unlock
Kyle Redelinghuys · 2026-03-16



I set up Claude Code agent files when the feature first landed. Created a few .claude/agents/ definitions, gave them names and tool restrictions, felt good about it, and then gradually stopped thinking about them. At some point Claude just started handling things well enough on its own that the agent files sat there gathering dust. They're still in my repos. They just don't get called.

I kept telling myself I wasn't missing anything, but context quality on my medium-to-large solo projects had started to feel off, with responses getting vaguer as sessions grew and the model losing track of decisions made earlier in a conversation. I was working around it using Cont3xt.dev, a tool I built specifically to manage AI context, and that helped, but it felt like I was solving a symptom rather than understanding the actual problem. So I went back and dug properly into what agents and subagents actually do, and more importantly what they unlock that a single-agent session can't.

The context window is the whole story

Standard Claude Code gives you a 200K-token context window per session. That sounds enormous until you're in a multi-hour session on a project with a dozen files open, a long conversation history, and tool call outputs stacking up. In my experience, by the time you hit two-thirds capacity, response quality degrades noticeably, not because the model is worse, but because the context is full of noise and the model has to sift through all of it to find what still matters. I'd been experiencing this without quite naming it.

Subagents solve this by giving each delegated task its own isolated 200K-token context. The parent agent spawns a subagent with a specific prompt. The subagent does its work (reading files, running searches, making tool calls) and returns only its final output to the parent, be that a summary, a result, or a recommendation. All the intermediate noise stays inside the subagent's context and never touches the parent's conversation. The parent gets the signal, not the noise.
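To make the isolation concrete, here is a toy model rather than anything Claude Code actually runs. The `Context` class, the two `run_*` functions, and all token sizes are hypothetical; the only thing taken from the text is that intermediate work stays in the subagent's context and only the final output reaches the parent.

```python
from dataclasses import dataclass, field


@dataclass
class Context:
    """A toy context window: a token limit and a log of (label, tokens) entries."""
    limit: int = 200_000
    entries: list = field(default_factory=list)

    def add(self, label: str, tokens: int) -> None:
        self.entries.append((label, tokens))

    @property
    def used(self) -> int:
        return sum(tokens for _, tokens in self.entries)


def run_inline(parent: Context, steps: list, summary_tokens: int) -> None:
    """Single-agent session: every intermediate step lands in the parent context."""
    for label, tokens in steps:
        parent.add(label, tokens)
    parent.add("summary", summary_tokens)


def run_in_subagent(parent: Context, steps: list, summary_tokens: int) -> Context:
    """Delegated task: steps fill a fresh context; only the summary reaches the parent."""
    child = Context()
    for label, tokens in steps:
        child.add(label, tokens)
    child.add("summary", summary_tokens)          # the subagent writes the summary
    parent.add("subagent summary", summary_tokens)  # the parent only sees this
    return child


# Hypothetical token sizes for one exploratory task.
steps = [("read files", 30_000), ("grep output", 12_000), ("tool calls", 8_000)]

inline_parent = Context()
run_inline(inline_parent, steps, summary_tokens=500)

delegating_parent = Context()
child = run_in_subagent(delegating_parent, steps, summary_tokens=500)

print(inline_parent.used)      # 50500
print(delegating_parent.used)  # 500
print(child.used)              # 50500
```

The total work is the same in both cases. What changes is where the 50,000 tokens of exploration end up.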

This is the actual value. Not parallelism, not specialisation, not the organisational tidiness of named agents. It's that isolation prevents the context rot that compounds over long sessions.

What the architecture looks like

The orchestrator-worker pattern is fairly simple. A parent agent analyses a task, decides whether to handle it directly or delegate it, and uses the Agent tool (previously called the Task tool, and both names still work) to spin up a subagent with a prompt string. The subagent runs with its own context, tool access, and permissions, then returns a single final message. Subagents cannot spawn further subagents, which keeps the nesting manageable.

Claude ships with three built-in subagent types. The Explore type handles read-only file discovery and codebase search, running on Haiku by default for speed and cost. The Plan type gathers context before presenting a strategy in plan mode. The General-purpose type handles anything involving both exploration and modification. Claude routes to these automatically based on task characteristics, though the auto-selection is imperfect in practice (more on that below).

Custom agents are defined as Markdown files with YAML frontmatter, stored in .claude/agents/ at project scope or ~/.claude/agents/ at user scope. A basic definition looks like this:

---
name: code-reviewer
description: Expert code review specialist. Use immediately after modifying code.
tools: Read, Grep, Glob
model: sonnet
permissionMode: default
---
You are a senior code reviewer checking for bugs, security issues, and code quality.
Review any code changes and return a concise list of specific findings.

The tools field does something genuinely useful here: it physically restricts what the subagent can do. A reviewer defined with only Read, Grep, Glob cannot write files. That's not a naming convention or a prompt instruction, it's a hard constraint. For a solo developer running with broad permissions, having a review agent that structurally cannot modify code is worth something.
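Because the restriction lives in the frontmatter, it can also be checked mechanically, for example in CI. The sketch below is a hypothetical lint helper, not part of Claude Code: `parse_frontmatter` and `is_read_only` are names invented here, the parser only handles flat `key: value` lines, and the read-only set is just the three tools from the example above. It treats a missing `tools` field as a failure, on the assumption that an agent with no explicit list is not restricted.

```python
READ_ONLY_TOOLS = {"Read", "Grep", "Glob"}  # assumption: the tools you count as read-only


def parse_frontmatter(text: str) -> dict:
    """Parse flat `key: value` frontmatter between the first two `---` lines."""
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("missing opening --- line")
    fields = {}
    for line in lines[1:]:
        if line.strip() == "---":
            return fields
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    raise ValueError("missing closing --- line")


def is_read_only(agent_text: str) -> bool:
    """True only if the agent lists tools explicitly and all are in the read-only set."""
    fields = parse_frontmatter(agent_text)
    if "tools" not in fields:
        return False  # no explicit list: assume the agent is not restricted
    tools = {t.strip() for t in fields["tools"].split(",") if t.strip()}
    return bool(tools) and tools <= READ_ONLY_TOOLS


reviewer = """---
name: code-reviewer
description: Expert code review specialist. Use immediately after modifying code.
tools: Read, Grep, Glob
model: sonnet
---
You are a senior code reviewer.
"""

print(is_read_only(reviewer))  # True
```

Running a check like this over every file in .claude/agents/ would catch a reviewer that quietly gained a write-capable tool in a later commit.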

The model field lets you route different tasks to different models. Haiku for cheap exploratory reads, Sonnet for standard implementation, Opus for complex reasoning. On pay-as-you-go pricing the cost difference between models is substantial, and there's no reason a "find all files that import this package" task needs Opus.

The real cost picture

Subagents are not free. Each spawned agent opens its own context window, which means tokens multiply quickly. Anthropic's own documentation notes that multi-agent workflows use roughly 4-7x more tokens than single-agent sessions, and Agent Teams (the experimental multi-session variant announced in February 2026) run at roughly 15x standard usage. If you're on the API and paying per token, that multiplier matters.

I've written in detail about the Claude Code pricing options and the Max plan economics, and the core finding holds: over 90% of tokens in a typical heavy session are prompt cache reads at $0.50/MTok for Opus, which dramatically softens the apparent cost of subagent expansion. But the multiplier is still real, and running five parallel subagents burning through exploratory reads simultaneously on the Pro plan is a reliable way to hit rate limits in under twenty minutes.
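A rough back-of-envelope version of that arithmetic, as a sketch under stated assumptions: the 4-7x multiplier, the 90% cache-read share, and the $0.50/MTok cache-read price come from the text above, while the 1M-token baseline and the $10/MTok blended price for all other tokens are placeholders chosen for illustration, not real rates. It also assumes the 90% share still holds once subagents are involved, which may not be true.

```python
def session_cost_usd(total_tokens: int, cache_read_share: float,
                     cache_read_price: float, other_price: float) -> float:
    """Blended session cost in USD; both prices are per million tokens."""
    cached = total_tokens * cache_read_share
    other = total_tokens - cached
    return round((cached * cache_read_price + other * other_price) / 1_000_000, 2)


BASELINE_TOKENS = 1_000_000   # hypothetical single-agent session
CACHE_READ_SHARE = 0.90       # from the text: over 90% of tokens are cache reads
CACHE_READ_PRICE = 0.50       # from the text: $0.50/MTok for Opus cache reads
OTHER_PRICE = 10.00           # placeholder for all non-cache-read tokens

for multiplier in (1, 4, 7):
    tokens = BASELINE_TOKENS * multiplier
    cost = session_cost_usd(tokens, CACHE_READ_SHARE, CACHE_READ_PRICE, OTHER_PRICE)
    print(f"{multiplier}x -> {tokens:,} tokens, about ${cost:.2f}")
```

With these placeholder numbers the cache reads make up 90% of the tokens but well under half of the bill, which is why the multiplier hurts less in dollars than the raw token count suggests. It does nothing for rate limits, which count usage regardless of price.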

The sensible approach is narrow scoping: use subagents for read-heavy, bounded tasks with a clear output, and keep the main session for anything requiring sustained, cross-cutting context. Don't spawn agents because you can.

Where they actually help

The most consistent community finding, which matches the logic of the architecture, is that subagents work best for read-heavy research and exploration, not parallel coding. A subagent sent to find all places a particular function is called, summarise a subsystem's behaviour, or check whether a proposed change would break any existing contracts will produce a small, clean output and keep its exploration cost internal. That's the use case the architecture is optimised for.

The C compiler example Anthropic uses as a flagship demonstration is instructive: 16 Opus agents, 2,000 sessions over two weeks, $20,000 in API costs, building a C compiler in Rust from scratch. Impressive, and structurally possible only with subagents. Also completely inappropriate as a model for a solo developer on a SaaS product. The lesson from that project that does transfer is the decomposition principle: each agent worked on an independent failing test, with no cross-agent dependencies. When tasks are truly independent, parallel agents compound your speed. When they're coupled, you get coordination overhead and conflicting changes.

For medium-to-large solo projects, the pattern I find most defensible is 2-3 focused information-gathering agents running in parallel, with the main session synthesising their outputs and making decisions. An agent that reads and summarises all test failures. An agent that checks the database schema for a relevant table. An agent that scans for existing implementations of a pattern you're about to add. All of them returning concise outputs to a main session that then acts. That's meaningfully better than a single session doing all of those reads sequentially, because the context stays clean.
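The shape of that pattern is an ordinary fan-out followed by a single synthesis step. The sketch below shows the shape only: the three gatherer functions are stand-ins that return canned strings where a real setup would invoke a subagent, and every name and result in it is hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


# Stand-ins for read-only subagents. Each returns one short, self-contained finding.
def summarise_test_failures() -> str:
    return "2 failing tests, both in the billing package"


def check_schema() -> str:
    return "invoices table has no currency column"


def find_existing_pattern() -> str:
    return "a retry helper already exists"


GATHERERS = {
    "tests": summarise_test_failures,
    "schema": check_schema,
    "pattern": find_existing_pattern,
}


def gather(gatherers: dict) -> dict:
    """Run independent gatherers in parallel and collect one concise result each."""
    with ThreadPoolExecutor(max_workers=len(gatherers)) as pool:
        futures = {name: pool.submit(fn) for name, fn in gatherers.items()}
        return {name: future.result() for name, future in futures.items()}


def synthesise(results: dict) -> str:
    """The main session's view: a few lines of findings, none of the raw reads."""
    return "\n".join(f"- {name}: {text}" for name, text in results.items())


print(synthesise(gather(GATHERERS)))
```

The gatherers share no state and never call each other, which is the independence condition from the compiler example at a much smaller scale.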

What doesn't work well yet

Auto-selection of custom agents remains unreliable. Claude frequently handles tasks in the main session rather than delegating to a defined agent, even when the agent is explicitly relevant and its description matches the task. The only reliable trigger is explicit invocation, which defeats the purpose of automatic routing for anyone who wants a seamless workflow. There are open GitHub issues on this, and it's a known gap, not an edge case.

Claude Opus 4.6 has a known tendency to over-spawn subagents. Anthropic's own prompt engineering documentation flags it: Opus will delegate to agents in situations where a direct approach would be faster and cheaper. If you're on Opus and wondering why a simple task consumed 50K tokens, an unnecessary delegation is a likely cause.

And there's no native observability. No trace view, no per-agent cost breakdown, no way to see what a running subagent is doing without looking at raw outputs. If you're building on the Claude Agent SDK, third-party tools fill some of this gap, but in the terminal Claude Code workflow you're largely flying blind on costs and subagent activity.

The hooks connection

Worth noting for anyone already using Claude Code hooks: hooks interact with subagents through dedicated lifecycle events, SubagentStart and SubagentStop, which means you can instrument your subagent activity, apply tool-level restrictions via PreToolUse, and validate outputs before they reach the parent. If you're already invested in hooks, the subagent lifecycle events add a meaningful layer of control over what agents can actually do.
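As one way to claw back some observability, a hook command could append a line to a log file for each subagent lifecycle event. This is a sketch under assumptions: it presumes the hook receives a JSON payload on stdin carrying `hook_event_name` and `session_id` fields, and the log file name is arbitrary. Check the payload shape against the hooks documentation for your version before relying on it.

```python
import json
import sys
import time

LOG_PATH = "subagent-events.log"  # arbitrary file name chosen for this sketch


def log_line(payload: dict, now: float) -> str:
    """Format one log line for a subagent lifecycle event; '' for any other event."""
    event = payload.get("hook_event_name", "")  # assumed payload field
    if event not in ("SubagentStart", "SubagentStop"):
        return ""
    session = payload.get("session_id", "unknown")  # assumed payload field
    return f"{now:.0f}\t{event}\t{session}"


def main() -> None:
    """Read one hook payload from stdin and append a log line if it is relevant."""
    raw = sys.stdin.read()
    if not raw.strip():
        return
    line = log_line(json.loads(raw), time.time())
    if line:
        with open(LOG_PATH, "a", encoding="utf-8") as fh:
            fh.write(line + "\n")


# In a real hook script, call main() here and register the script
# for the SubagentStart and SubagentStop events.
```

Pairing start and stop timestamps per session gives a crude duration trace, which is not a cost breakdown but is better than nothing.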

Whether it's worth revisiting

I went into this research expecting to find that the feature had matured and I was missing something obvious. The answer is more nuanced. The core innovation, isolated context windows that keep exploration noise out of the main session, is genuinely valuable and solves a real problem I was experiencing on larger projects. The custom agent definitions give you a readable, version-controlled way to encode tool restrictions and model routing decisions that actually enforce behaviour rather than just prompting for it.

What hasn't fully landed is reliable automatic routing, which means you're often writing explicit invocations rather than building a system that knows when to delegate. And the cost multiplier is real enough that undisciplined subagent use will hurt you on any plan with token limits.

The old agent files in my repos are worth revisiting. Not as a multi-agent system with coordinating roles and specialised responsibilities, but as a small library of focused, read-only information gatherers that I explicitly invoke when I need clean, isolated context for a bounded research task. That's a narrower use case than the documentation implies, but it's one that actually maps to the architecture's strengths.