More Control, More Cost: Why Commanding AI Isn't Delegation
synthaicode · 2026-05-24 · via DEV Community

Yesterday, you typed /format.

Checked the output. Typed /refactor. Checked again. Typed /test.

You finished the session feeling productive. The AI did the work. You supervised.

That's not delegation. That's shift work.


A note on framing: This article traces a structural pattern — not a documented changelog. The "Command Era" and "Harness Era" described below are not precise historical dates. They are recurring failure modes, observable across teams and tools, that tend to appear in this sequence. Read it as structural history, not product timeline.


Chapter 1: The Command Era — We Gave AI More to Do, and Did More Ourselves

When AI Skills became a shared convention, it felt like a breakthrough. Skill-sharing sites appeared. You could /summarize, /diagram, /translate, /review. The list kept growing.

Then came the Format Wars.

How should a Skill file be structured? Which headers does the AI actually read? What syntax survives context compression? The debate ran long. Until deterministic tooling settled it — editors began parsing Skill files in a fixed, predictable way. The format question had an answer. The community moved on.

But nobody asked the question underneath the question.

The Format Wars were about how to write commands. Nobody asked whether commanding was the right model at all.

The /command culture became official. Endorsed. Infrastructured. Skill-sharing sites cataloged thousands of entries. Most were wrappers around things that didn't need AI. Many were things a shell script would have handled faster. But they were Skills, and Skills had / in front of them, and that felt like the future.

There was just one problem.

Someone still had to decide which commands to run, in which order, and when to stop.

That someone was you.

The AI's capability surface expanded. Your orchestration burden expanded with it. Every new command you could invoke was another thing you had to remember, sequence, and supervise. You didn't gain leverage. You gained a longer checklist.

This is micromanagement. Not as a criticism — as a structural description.

Micromanagement: decompose work into atomic units, issue each unit individually, retain the sequence in your own head, verify each step before proceeding.

That is exactly what /command workflows do. The fact that the executor is an AI doesn't change the structure.
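
The structure can be sketched in a few lines of Python. This is an illustration only: `run_commands`, `execute`, and `approve` are hypothetical names, not part of any real tool. The counter tracks how many decisions stay with the human.

```python
from typing import Callable


def run_commands(
    commands: list[str],
    execute: Callable[[str], str],
    approve: Callable[[str, str], bool],
) -> dict:
    """Issue commands one at a time; the caller owns the sequence and the checks."""
    human_decisions = 0
    outputs = []
    for command in commands:       # the sequence lives in the human's head
        human_decisions += 1       # decision: run this command next
        output = execute(command)  # the only step the AI owns
        human_decisions += 1       # decision: is this output acceptable?
        if not approve(command, output):
            break                  # the human also decides when to stop
        outputs.append(output)
    return {"outputs": outputs, "human_decisions": human_decisions}


result = run_commands(
    ["/format", "/refactor", "/test"],
    execute=lambda command: f"output of {command}",
    approve=lambda command, output: True,
)
print(result["human_decisions"])  # 6: two human decisions per command
```

Adding a fourth command to the list adds two more human decisions. Nothing in the loop lets the executor take any of them over.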


Chapter 2: The Harness Era — We Tried to Control What We Couldn't Trust

The next wave brought a different instinct: if we can't control what AI does step by step, we can control the boundaries of what it's allowed to do.

Harnesses arrived. Guardrails. Deterministic control layers wrapped around probabilistic systems.

The logic was reasonable: AI behavior is unpredictable, so build fences. Define what's allowed. Block what isn't. Ship.

But in practice, AI systems do not behave like static rule evaluators. They search for plausible paths toward the requested outcome. A fence with gaps is not a fence — it's a detour.

So the gaps got patched. New gaps appeared. More patches. The harness grew. The team maintaining it grew. The surface area of "things that could go wrong that we haven't written a rule for yet" grew faster than the rules.

This is the fundamental mismatch:

Harnesses are deterministic. AI is probabilistic. You cannot enumerate your way out of a probability space.

A blacklist only covers what you've already seen. A probabilistic system continuously generates what you haven't. The harness team is always one incident behind.
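
A minimal sketch of that gap, assuming a guard that matches proposed actions against known-bad substrings. The patterns and the `is_allowed` helper are hypothetical and far simpler than a production guardrail.

```python
# Hypothetical rules, each one written after a past incident.
BLOCKED_PATTERNS = ["rm -rf", "drop table"]


def is_allowed(action: str) -> bool:
    """Deterministic guard: reject an action only if it matches a known pattern."""
    lowered = action.lower()
    return not any(pattern in lowered for pattern in BLOCKED_PATTERNS)


print(is_allowed("rm -rf /tmp/build"))        # False: this incident was seen before
print(is_allowed("find /tmp/build -delete"))  # True: same effect, never enumerated
```

Both actions delete the same directory. The guard stops only the one that someone already wrote a rule for.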

| | /command culture | Harness culture |
|---|---|---|
| What you're controlling | Sequence of actions | Range of behaviors |
| Control mechanism | Deterministic commands | Deterministic guards |
| Human cost | Orchestrating commands | Maintaining guardrails |
| Failure mode | You become the bottleneck | Gaps appear faster than patches |
| Root cause | Can't delegate judgment | Can't trust judgment |

The root cause is identical. Both eras were responses to the same absence: judgment was never transferred to the AI.


Chapter 3: Why Neither Scales

Scale means your output grows faster than your input. Delegation scales when the delegatee handles not just execution but the decisions that surround execution.

What /commands delegate: individual actions.

What harnesses delegate: nothing. They constrain; they do not delegate.

What both leave with the human: the judgment about what to do, when, and whether it's done.

When AI capability increases under this model, the human cost increases proportionally:

  • More capable AI → more commands available → more orchestration decisions to make
  • More capable AI → more behavioral surface area → more guardrails needed

AI getting stronger, under the command-and-harness model, makes you busier.

That is not scale. That is the opposite of scale.

The error is architectural. Both approaches treat AI as a deterministic tool that happens to be probabilistic — an uncomfortable fact to be engineered around rather than a design primitive to be worked with.

You cannot harness your way to trust. You cannot command your way to delegation.


Chapter 4: What Actual Delegation Requires

Delegation — the kind that scales — transfers three things:

  1. Purpose: not what to do, but why
  2. Completion condition: not a checklist, but a state to reach
  3. Reasoning trace: where the judgment came from, so it can be questioned and revised

When those three are present, the AI doesn't wait for the next command. It navigates. When something goes wrong, it's not because the AI "escaped" — it's a signal that the completion condition was underspecified. That's a design problem, not a containment problem.

The unit of delegation is not a command. It's a context-complete work unit: purpose + completion condition + the chain of reasoning that produced both.
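
One way to picture such a unit is as a small record, sketched here in Python. The `WorkUnit` class and its field names are assumptions made for illustration; the article does not prescribe a schema.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class WorkUnit:
    purpose: str                        # why the work exists, not what to do
    completion: Callable[[dict], bool]  # a state to reach, not a checklist
    reasoning: tuple[str, ...]          # where the judgment came from

    def is_done(self, state: dict) -> bool:
        """Check an observed state against the completion condition."""
        return self.completion(state)


unit = WorkUnit(
    purpose="Let new contributors run the test suite without asking for help",
    completion=lambda s: s.get("tests_pass", False) and s.get("setup_steps", 99) <= 3,
    reasoning=(
        "Onboarding took two days last quarter",
        "Most questions were about test setup",
    ),
)

print(unit.is_done({"tests_pass": True, "setup_steps": 2}))  # True
print(unit.is_done({"tests_pass": True, "setup_steps": 5}))  # False
```

The record lists no commands. Any sequence of actions that reaches the described state counts as done, and the reasoning stays attached so the condition itself can be questioned and revised.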

Now here's the practical problem.

Those three things have no natural home. Purpose gets buried in a Slack thread. Completion conditions live in someone's head. Reasoning traces disappear when the chat context rolls over. The next session starts from scratch. The AI doesn't know what "done" looked like last time, or why.

This is why judgment doesn't transfer even when people try. The content of the judgment exists — but it has nowhere persistent to live. So it stays with the human, who re-explains it every session, re-verifies every output, and never fully lets go.

Actual delegation requires the judgment unit to be externalized, addressable, and stable across sessions.

Not stored in a prompt. Not reconstructed from memory. Formally referenced — the way a requirement document is referenced in a design review, not the way a conversation is remembered.

This is what XRefKit is built to carry. XIDs give each work unit a stable identity — independent of file paths, tool versions, or context windows. When you hand a work unit to an AI agent, you're not passing a command string. You're passing a reference: here is the purpose, here is what done looks like, here is the reasoning that got us here — and it won't disappear when this session ends.

The AI can then ask: does my current output satisfy the completion condition on record? It can trace backward: what was the intent behind this requirement? It can surface a judgment call: I found two valid paths — here's which one aligns with the recorded purpose.
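
The sketch below shows the general idea of looking a work unit up by a stable ID in a later session. It is not XRefKit's API or file format: the JSON store, the `WU-0001` identifier, and the helper functions are all hypothetical, and the substring check is a deliberately crude stand-in for a real completion test.

```python
import json
import tempfile
from pathlib import Path


def save_unit(store: Path, unit_id: str, record: dict) -> None:
    """Write one work unit under a stable ID (hypothetical format)."""
    units = json.loads(store.read_text()) if store.exists() else {}
    units[unit_id] = record
    store.write_text(json.dumps(units, indent=2))


def load_unit(store: Path, unit_id: str) -> dict:
    """Look the unit up by ID; raises KeyError if it was never recorded."""
    return json.loads(store.read_text())[unit_id]


def unmet_conditions(record: dict, output: str) -> list[str]:
    """Return the recorded completion conditions the output does not yet cover."""
    return [phrase for phrase in record["done_when"] if phrase not in output]


with tempfile.TemporaryDirectory() as tmp:
    store = Path(tmp) / "work_units.json"
    # Session one: the human records the judgment once.
    save_unit(store, "WU-0001", {
        "purpose": "Explain the outage to customers in plain language",
        "done_when": ["root cause", "what we changed"],
        "reasoning": ["Support tickets asked why, not what"],
    })
    # Session two: a fresh context that knows only the ID.
    record = load_unit(store, "WU-0001")
    draft = "The root cause was an expired certificate."
    gaps = unmet_conditions(record, draft)

print(gaps)  # ['what we changed']
```

The second session never sees the original conversation. It receives an identifier, reads the purpose and the completion condition back from the store, and reports what is still missing.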

That is not a tool executing a command. That is an agent operating within a delegated judgment frame — one that persists, accumulates, and can be audited.


The Through-Line

Two eras, one error.

  • Commands: we gave AI actions but kept the sequence
  • Harnesses: we gave AI boundaries but kept the trust
  • Both: we kept the judgment, handed over the execution

The management cost compounded with each era because the root cause was never addressed.

Delegation is not about what you hand the AI to do. It's about what you no longer have to decide.

When you /format, you decided to format. When you maintain a harness, you decided what counts as safe. When you transfer a work unit with purpose, completion condition, and traceable reasoning — and that unit persists beyond the session — you've transferred the decision.

That's when it scales.


This is the second article in a series on AI organizational design. The first, Micromanaging AI Doesn't Scale, introduced the core problem. XRefKit is available at github.com/XRefKit.