惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

阮一峰的网络日志
阮一峰的网络日志
D
Darknet – Hacking Tools, Hacker News & Cyber Security
S
Schneier on Security
The Last Watchdog
The Last Watchdog
Cyberwarzone
Cyberwarzone
S
Securelist
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
C
Cyber Attacks, Cyber Crime and Cyber Security
L
Lohrmann on Cybersecurity
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
博客园 - 司徒正美
The Cloudflare Blog
V
V2EX
博客园_首页
博客园 - 聂微东
Vercel News
Vercel News
人人都是产品经理
人人都是产品经理
G
GRAHAM CLULEY
T
Tenable Blog
Last Week in AI
Last Week in AI
Y
Y Combinator Blog
L
LINUX DO - 最新话题
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
SecWiki News
SecWiki News
博客园 - 三生石上(FineUI控件)
S
Secure Thoughts
N
News | PayPal Newsroom
T
The Blog of Author Tim Ferriss
The GitHub Blog
The GitHub Blog
T
Troy Hunt's Blog
博客园 - 【当耐特】
Forbes - Security
Forbes - Security
H
Hacker News: Front Page
A
About on SuperTechFans
B
Blog RSS Feed
Engineering at Meta
Engineering at Meta
MongoDB | Blog
MongoDB | Blog
CTFtime.org: upcoming CTF events
CTFtime.org: upcoming CTF events
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
罗磊的独立博客
D
DataBreaches.Net
P
Privacy & Cybersecurity Law Blog
Schneier on Security
Schneier on Security
Application and Cybersecurity Blog
Application and Cybersecurity Blog
Google DeepMind News
Google DeepMind News
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
Jina AI
Jina AI
D
Docker
P
Proofpoint News Feed

年华转瞬

Pensieve: 2606 Pensieve: 2605 Pensieve: 2604 跟着AI学编译原理 Pensieve: 2603 Pensieve: 2602 Pensieve: 2601 GLM ASR试用 Pensieve: 2512 Pensieve: 2511
Context
夏恺, Kai Xia, xiaket · 2026-03-01 · via 年华转瞬

2026-03-01 10:35

This is my very personal, very meta take on AI tooling — not a how-to, just a lens: treat an LLM like an improv comedian. The model is the performer; the context is whatever the performer has to work with — the audience suggestions, the rules of the game, and everything said so far. Change that input, and you change the performance. It’s a simple framing, but it matters, because most of the “new” tooling we talk about later is really just different ways of shaping context.

If we go back a bit — when ChatGPT first went mainstream — prompt engineering became a buzzword. The idea was that the right wording could coax noticeably better output. That hype has cooled, especially as a standalone job title, but not because the practice vanished. My take is that, back then, the comedian simply wasn’t that strong yet, so you had to feed it very specific suggestions to get a good bit. Today, the comedian is better, and the craft has shifted: less about finding magic phrases, more about shaping context and building workflows. In that sense, the kaleidoscopic tools we have today are still extending prompt engineering — they’re just turning it into scaffolding, templates, and context choreography.

Fast forward to today, we live in an age of abundance in AI models. There are a lot of capable performers, and within the top tier the difference is often subtle — and highly task-dependent. So the leverage moves to the room you put them in: the better you can shape the context, the better outcomes you tend to get. Give a decent comedian great suggestions and a clear setup, and they can outperform a better comedian stuck with a confused, noisy room. Context really matters.

How do we control context? Tools can help, but most of them are generic — they don’t really know what you consider signal versus noise, so they can’t safely curate the room for you. For example, Claude Code can compact the conversation to make more room for what comes next. But if you toss in an unrelated question halfway through, or dump a giant chunk of raw logs into the chat, the performer is now doing improv in a noisy room: it becomes harder to track what matters, what’s incidental, and what should be ignored.

At this point, you might think we’re powerless. We’re not. A whole set of inventions exists for one purpose: keep the context window aligned with the problem we’re actually trying to solve.

Let’s start with MCP. To me, it’s a great invention not because it’s yet another integration, but because it changes how much integration knowledge you have to carry in the context window. MCP gives the model a small, discoverable interface to external systems — tools, resources, and even reusable prompts — via clear schemas and structured inputs/outputs. In practice, an MCP server often fronts one or more APIs, but it doesn’t have to expose the entire surface area. It can present a curated menu of capabilities with sane defaults. The LLM sees a menu of dishes, not a book full of recipes: instead of pasting curl/auth/swagger docs into the chat, you give it a handful of typed tool calls. That keeps the context lean and the model focused on the task.

Next, I’d like to talk about subagents. In my opinion, this is another great invention: instead of stuffing everything into one ever-growing chat, you delegate a narrowly scoped task to a separate agent running in its own context window (often with its own prompt and tool permissions), then bring the result back to the main thread. The goal isn’t “zero context” so much as “clean context”: keep the orchestrator’s context focused on the plan and the decisions, not the churn. Debugging is a classic example. Logs are high-volume and usually low-signal, and they can quickly drown out what you actually care about. A subagent can wade through the noise in isolation and can be instructed/designed to return a short, decision-ready summary so the rest of the workflow doesn’t get contaminated.

I don’t see skills as a breakthrough. To me, it’s prompt engineering with better ergonomics: modular, shareable, and loaded on demand. That’s a real UX win. But context-wise, it’s an incremental improvement, not a new paradigm. You’re not eliminating instruction tokens — you’re moving them out of the chat transcript and pulling them in when relevant. Once loaded, those tokens still compete with conversation history and everything else. If anything, a large catalog of Skills can create a new kind of noise: routing and discoverability overhead.

While I know Geoffrey Huntley personally, Ralph Loop still doesn’t feel like a new kind of context management to me. It’s a solid reliability pattern: keep the long-term state outside the conversation, restart with a clean context each round, do one small unit of work, validate it, then loop. That can be very effective precisely because it avoids context rot. But it’s not the “art” I’m seeking. It’s less about shaping the room and more about clearing the room over and over — paying iteration cost for robustness.

Gas Town adds real value in coordination and persistence — it stores work state outside the chat in git-backed hooks/ledgers — but at the end of the day it still inherits the strengths and limits of the underlying agent runtime (e.g., Claude Code or Codex). So if you only care about token-level context shaping, it may feel like ‘nothing new’; if you care about multi-agent reliability, it’s a different story.

So if we zoom out, a lot of AI tooling innovation is really context management innovation — not making the comedian magically smarter, but managing what the comedian gets to hear, when, and at what bandwidth. Different tools optimize different bottlenecks — token budget, noise isolation, state persistence, and coordination. The “art”, for me, is knowing which bottleneck you’re hitting and picking the smallest intervention that keeps the room coherent.

夏恺/Kai Xia/xiaket

Bookworm/Pythonista/DevOps/Melburnian