Claude Agent SDK: Subagents, Sessions and Why It's Worth It
Kyle Redelinghuys · 2026-03-02


I've been wanting to build a proper multi-agent setup for a while, and every time I sat down to figure out how to do it cleanly against the raw Anthropic API, I ended up writing boilerplate I didn't want to write. Checking stop reasons, feeding tool results back into the message array, managing context as it grows, deciding what to trim when you hit limits. None of it is hard, but all of it is work that isn't the actual problem you're trying to solve. I've been reading through the Claude Agent SDK docs and experimenting with it over the past few weeks, and it's a more complete answer to this than I expected.

What it is

The Claude Agent SDK is the infrastructure that powers Claude Code, exposed as a library. Anthropic renamed it from the "Claude Code SDK" in September 2025, which reflects that the runtime they built for a coding tool had become general enough to use for any agentic workflow. It's available as @anthropic-ai/claude-agent-sdk in TypeScript and claude-agent-sdk in Python.

The core of it is a query() function that returns an async generator, yielding typed messages as the agent works through a task.

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Review the authentication module for security issues",
  options: {
    model: "sonnet",
    allowedTools: ["Read", "Glob", "Grep"],
    permissionMode: "acceptEdits",
    maxTurns: 50,
  }
})) {
  if (message.type === "result" && message.subtype === "success") {
    console.log(message.result);
    console.log(`Cost: $${message.total_cost_usd}`);
  }
}

What you're not doing here is writing the agent loop. You're not checking stop_reason === "tool_use", executing tools, and feeding results back. The SDK handles that, along with automatic context compaction when the conversation approaches its limits and re-reading any CLAUDE.md instruction files after compaction. There are 14+ built-in tools - Read, Write, Bash, Glob, Grep, WebSearch, WebFetch and more - referenced by name as strings rather than defined as JSON schemas. You just list what the agent is allowed to use and let it get on with it.
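For contrast, this is roughly the shape of the loop you would otherwise write by hand. It is a minimal sketch with a stubbed model so it runs without an API key: `runAgentLoop`, `fakeModel`, and the simplified message types are all hypothetical names for illustration, loosely modelled on the `stop_reason` / `tool_use` pattern, and say nothing about how the SDK implements it internally.

```typescript
// Hypothetical, simplified shapes; the real API types are richer than this.
type ToolCall = { id: string; name: string; input: Record<string, string> };
type ModelReply =
  | { stop_reason: "tool_use"; toolCalls: ToolCall[] }
  | { stop_reason: "end_turn"; text: string };
type Message = { role: "user" | "assistant" | "tool"; content: string };

type ModelFn = (history: Message[]) => ModelReply;
type ToolFn = (input: Record<string, string>) => string;

// The hand-rolled loop: call the model, run any requested tools,
// append their results to the history, repeat until the model ends its turn.
function runAgentLoop(
  prompt: string,
  model: ModelFn,
  tools: Record<string, ToolFn>,
  maxTurns = 10,
): { result: string; turns: number } {
  const history: Message[] = [{ role: "user", content: prompt }];
  for (let turn = 1; turn <= maxTurns; turn++) {
    const reply = model(history);
    if (reply.stop_reason === "end_turn") {
      return { result: reply.text, turns: turn };
    }
    history.push({ role: "assistant", content: JSON.stringify(reply.toolCalls) });
    for (const call of reply.toolCalls) {
      const tool: ToolFn | undefined = tools[call.name];
      const output = tool ? tool(call.input) : `Unknown tool: ${call.name}`;
      history.push({ role: "tool", content: `${call.id}: ${output}` });
    }
  }
  throw new Error(`No final answer after ${maxTurns} turns`);
}

// Stub model: requests one Read call, then answers using the tool output.
const fakeModel: ModelFn = (history) => {
  const toolResult = history.find((m) => m.role === "tool");
  return toolResult
    ? { stop_reason: "end_turn", text: `Reviewed -> ${toolResult.content}` }
    : {
        stop_reason: "tool_use",
        toolCalls: [{ id: "t1", name: "Read", input: { path: "auth.ts" } }],
      };
};

const outcome = runAgentLoop("Review auth.ts", fakeModel, {
  Read: (input) => `contents of ${input.path}`,
});
console.log(outcome);
```

Even this toy version leaves out context trimming, retries, and permission checks, which is where most of the boilerplate ends up.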

Subagent orchestration

The reason I was drawn to this in the first place is the subagent model, which solves the orchestration problem cleanly. You define agent types in an agents parameter, each with its own description, system prompt, restricted tool access, and optionally a different model. You include Task in allowedTools, and when Claude decides a subtask fits one of those definitions, it spawns the subagent, gives it only the specific task, and gets back only the final result.

For a code review pipeline, this looks like:

import { query } from "@anthropic-ai/claude-agent-sdk";

for await (const message of query({
  prompt: "Do a full review of this codebase - security, test coverage, and performance.",
  options: {
    allowedTools: ["Read", "Glob", "Grep", "Task"],
    agents: {
      "security-reviewer": {
        description: "Identifies security vulnerabilities, injection risks, and auth issues",
        prompt: "You are a security specialist. Focus on SQL injection, XSS, CSRF, secrets exposure, and authentication bypass.",
        tools: ["Read", "Grep", "Glob"],
        model: "opus"
      },
      "test-analyzer": {
        description: "Analyses test coverage gaps and test quality",
        prompt: "Review test coverage, missing edge cases, and overall test quality.",
        tools: ["Read", "Grep", "Glob"],
        model: "haiku"
      },
      "performance-reviewer": {
        description: "Identifies performance bottlenecks and inefficient patterns",
        prompt: "Look for N+1 queries, unnecessary allocations, blocking operations, and slow algorithms.",
        tools: ["Read", "Grep", "Glob"],
        model: "sonnet"
      }
    }
  }
})) {
  if (message.type === "result") {
    console.log(message.result);
  }
}

A few things worth noting. Task must be in allowedTools or subagents never get spawned. Never include Task in a subagent's own tools, because subagents can't spawn their own subagents. And Claude decides when to parallelise - you're defining the capability, not the scheduling.
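The first two rules are easy to check mechanically before a run starts. The sketch below is a hypothetical pre-flight helper (`checkSubagentConfig` and the `AgentDef` type are my own names, not part of the SDK); it assumes only the field names used in the example above.

```typescript
// Hypothetical type mirroring the fields used in the agents example above.
type AgentDef = {
  description: string;
  prompt: string;
  tools?: string[];
  model?: string;
};

// Returns a list of human-readable problems; an empty list means the
// configuration satisfies both Task rules.
function checkSubagentConfig(
  allowedTools: string[],
  agents: Record<string, AgentDef>,
): string[] {
  const problems: string[] = [];
  if (Object.keys(agents).length > 0 && !allowedTools.includes("Task")) {
    problems.push('allowedTools is missing "Task", so no subagent can be spawned');
  }
  for (const [name, def] of Object.entries(agents)) {
    if (def.tools?.includes("Task")) {
      problems.push(`${name}: remove "Task" from tools, subagents cannot spawn subagents`);
    }
  }
  return problems;
}

// Deliberately broken config: no Task at the top level, Task inside a subagent.
const found = checkSubagentConfig(["Read", "Grep"], {
  "security-reviewer": {
    description: "Identifies security vulnerabilities",
    prompt: "You are a security specialist.",
    tools: ["Read", "Task"],
  },
});
console.log(found);
```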

The context isolation is what makes this genuinely useful. Each subagent gets its own context window, so the security reviewer's deep read of the auth code doesn't pollute the performance reviewer's analysis, and you're not managing three concurrent conversation threads yourself. The subagents run, return their findings, and the parent stitches it together. I did this kind of thing manually when building with Claude Code hooks, and the difference in how much glue code you need to write is significant.

Session persistence

Every query() call without a resume parameter starts a fresh session with no memory of anything before it. Capture the session_id from the init message and pass it back on the next call, and you continue exactly where you left off.

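import { query } from "@anthropic-ai/claude-agent-sdk";

// Declared as possibly undefined so strict TypeScript accepts the later read
let sessionId: string | undefined;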

for await (const msg of query({
  prompt: "Analyse this codebase and identify the three highest-priority security issues.",
  options: { allowedTools: ["Read", "Glob", "Grep"] }
})) {
  if (msg.type === "system" && msg.subtype === "init") sessionId = msg.session_id;
  if (msg.type === "result") console.log(msg.result);
}

// Continue with write access once you're happy with what the analysis found
for await (const msg of query({
  prompt: "Now fix those three issues.",
  options: {
    resume: sessionId,
    allowedTools: ["Read", "Edit", "Write"],
    permissionMode: "acceptEdits",
  }
})) {
  if (msg.type === "result") console.log(msg.result);
}

The practical value is being able to separate the read phase from the write phase - run analysis with read-only tools, review the output, then continue with write permissions only if you're happy with what it found. You can also pass forkSession alongside resume to branch from an existing session if you want to try different approaches without losing the analysis context.
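If you run several phases, the capture step is worth pulling into a helper. This is a sketch under assumptions: `runPhase` and `StreamMessage` are hypothetical names, the message shape includes only the fields read in the snippets above, and `fakeQuery` stands in for a real `query()` call so the example runs offline.

```typescript
// Hypothetical minimal message shape: just the fields this helper reads.
type StreamMessage = {
  type: string;
  subtype?: string;
  session_id?: string;
  result?: string;
};

// Drains a message stream and returns the session id plus the final result.
async function runPhase(
  stream: AsyncIterable<StreamMessage>,
): Promise<{ sessionId: string; result: string | undefined }> {
  let sessionId: string | undefined;
  let result: string | undefined;
  for await (const msg of stream) {
    if (msg.type === "system" && msg.subtype === "init") sessionId = msg.session_id;
    if (msg.type === "result") result = msg.result;
  }
  if (!sessionId) throw new Error("Stream ended without an init message");
  return { sessionId, result };
}

// Stand-in for query(); a real call would pass query({ prompt, options }) here.
async function* fakeQuery(): AsyncGenerator<StreamMessage> {
  yield { type: "system", subtype: "init", session_id: "sess-123" };
  yield { type: "assistant" };
  yield { type: "result", subtype: "success", result: "3 issues found" };
}

const phase = runPhase(fakeQuery());
phase.then((p) => console.log(p.sessionId, p.result));
```

The returned `sessionId` is what you would hand to `resume` on the next call, once you have read the result and decided the write phase should go ahead.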

The trade-offs

The trade-offs are real and worth being upfront about. You're locked into Claude models - there's no swapping in a different provider. The SDK runs by spawning a Claude Code CLI process as a subprocess rather than being a pure API library, which adds some deployment complexity. And agentic sessions that do substantial work can get expensive quickly, which matters a lot if you're building this into any kind of automated pipeline. I wrote about tracking Claude Code costs earlier, and the max_budget_usd parameter on query() (maxBudgetUsd in the TypeScript options) is worth setting from the start rather than finding out the hard way.
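A per-run cap doesn't stop a pipeline from starting run after run, so a small client-side tally is a useful complement. The sketch below is hypothetical (`BudgetTracker` is not an SDK class); it assumes only that each finished run reports a cost figure like the `total_cost_usd` field shown in the first example.

```typescript
// Hypothetical client-side guard; it complements the SDK's own budget
// setting rather than replacing it.
class BudgetTracker {
  private spentUsd = 0;
  private readonly limitUsd: number;

  constructor(limitUsd: number) {
    this.limitUsd = limitUsd;
  }

  // Record the cost reported by a finished run.
  record(costUsd: number): void {
    this.spentUsd += costUsd;
  }

  // Budget left for the pipeline, never negative.
  get remainingUsd(): number {
    return Math.max(0, this.limitUsd - this.spentUsd);
  }

  // Whether the pipeline should be allowed to start another run.
  canStartRun(): boolean {
    return this.remainingUsd > 0;
  }
}

const budget = new BudgetTracker(1.0);
budget.record(0.25); // e.g. the cost reported by an analysis run
budget.record(0.5); // e.g. the cost reported by a fix run
console.log(budget.remainingUsd, budget.canStartRun());
```

One way to combine the two is to pass `budget.remainingUsd` as the per-run cap, so that a single runaway session cannot spend more than the pipeline has left.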

The broader picture is that Anthropic is building a three-layer stack: MCP as the protocol for agent-tool communication, Agent Skills as portable capability packages, and the Claude Agent SDK as the runtime. The MCP work I've been following sits at that protocol layer, and the SDK has MCP support built in, so custom tools defined as MCP servers plug in cleanly through the same allowedTools pattern.
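In practice, "the same allowedTools pattern" means MCP tools are listed by name next to the built-ins, using the `mcp__<server>__<tool>` naming convention. A small sketch of that; the helper `mcpToolName` and the `tickets` server with its `search_tickets` tool are hypothetical, made up for illustration.

```typescript
// Builds the allowedTools entry for a tool exposed by an MCP server,
// following the mcp__<server>__<tool> naming convention.
function mcpToolName(server: string, tool: string): string {
  return `mcp__${server}__${tool}`;
}

// Hypothetical server and tool names, listed alongside built-in tools.
const allowedTools: string[] = [
  "Read",
  "Grep",
  mcpToolName("tickets", "search_tickets"),
];
console.log(allowedTools);
```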

I'm still in the exploration phase with this, and the API is moving - there's already a V2 session-based interface alongside the generator pattern in 0.2.x. But the subagent model and session persistence are solid enough to build on, and they solve the specific problems I was running into cleanly enough that I'll be using this properly rather than rolling my own agent loop.