copilot cloud agent is becoming an automation api
Paulo Victor · 2026-05-26 · via DEV Community

GitHub quietly crossed an important line this month: Copilot cloud agent tasks can now be started through a REST API.

That sounds like a small product update. Another endpoint. Another preview feature.

But I think this is one of those boring-looking changes that tells you where the whole category is going.

Once an agent can be started by an API, it becomes automation infrastructure.

Not "ask the assistant to fix this file."

More like: an internal developer portal creates a repo, opens a tracked agent task, watches progress, and collects the pull request. Or a migration script fans out dependency upgrades across repositories. Or the release workflow asks an agent to prepare the weekly PR.

That is useful.

It is also where the real problems begin.


the ui was the training wheels

Most teams first meet coding agents through chat.

That is fine. Chat is a good way to learn the shape of the tool. You ask for a refactor. It proposes a diff. It runs tests. You decide whether the work is good enough.

The human is still the scheduler.

The human picks the task, frames the boundary, and decides when to stop. There is friction everywhere, and that friction hides a lot of missing platform design.

An API removes some of that friction.

Now the agent can be triggered by another system. That means the agent can be queued, retried, templated, rate limited, observed, and embedded inside existing engineering workflows.

This is the moment where "AI coding assistant" begins to look like "background worker that happens to write code."

And background workers need boring things.

They need ownership. They need permissions. They need idempotency. They need logs. They need a reason to exist when someone finds them running at 3 AM.
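Idempotency is the easiest of those to show in code. Here is a minimal sketch, assuming the platform keeps its own record of which logical tasks it has already started. The names `idempotency_key` and `TaskRegistry` are hypothetical, and the in-memory dict stands in for whatever durable store you actually use. Nothing here is part of the GitHub API.

```python
import hashlib
import json


def idempotency_key(template: str, repo: str, params: dict) -> str:
    """Derive a stable key so a retried trigger maps to the same logical task."""
    # Sorting keys makes the key independent of dict ordering.
    canonical = json.dumps(
        {"template": template, "repo": repo, "params": params},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:16]


class TaskRegistry:
    """In-memory stand-in for a durable store (hypothetical)."""

    def __init__(self) -> None:
        self._started = {}  # key -> owning team

    def start_once(self, key: str, owner: str) -> bool:
        """Return True if this call claimed the key, False if it was already claimed."""
        if key in self._started:
            return False
        self._started[key] = owner
        return True

    def owner_of(self, key: str):
        """Answer the 3 AM question: who asked for this task?"""
        return self._started.get(key)
```

A trigger that fires twice for the same template, repo, and parameters gets the same key, so the second call can be dropped instead of starting a duplicate agent run.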

the task boundary becomes the product

The interesting part is not only that the API starts a task. The interesting part is what counts as a good task.

"Go modernize our services" is not a task. It is a wish with a repo attached.

"Upgrade this dependency from version X to Y in these 14 repositories, run the standard tests, do not change public behavior, and open one PR per repo with a checklist" is closer.

That difference matters because agents are unusually good at making vague work look busy. If the boundary is loose, the agent will fill the space with plausible activity: extra cleanup, side refactors, generated tests, package changes, formatting churn, maybe a tiny architecture opinion nobody asked for.

Sometimes that is helpful. Often it is how a small automation becomes review debt.

The API makes task design a platform concern. Internal tools will need task templates with explicit scope:

  • which repositories are eligible
  • which files can be changed
  • which commands can run
  • which tests are required
  • which labels and reviewers get attached
  • which changes require human approval before the agent continues
  • when the task should stop instead of improvising
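The list above can be sketched as a small data structure that the platform checks before and after an agent run. This is an illustration under my own assumptions: `TaskTemplate` and its fields are hypothetical names, paths are matched with simple glob patterns, and the stop rule is just "any file outside the allowed paths".

```python
from dataclasses import dataclass
from fnmatch import fnmatch


@dataclass(frozen=True)
class TaskTemplate:
    """Explicit scope for one kind of agent task (hypothetical shape)."""

    name: str
    eligible_repos: frozenset       # which repositories are eligible
    allowed_paths: tuple            # glob patterns for files that may change
    allowed_commands: tuple         # which commands can run
    required_tests: tuple           # which tests are required
    labels: tuple = ()              # labels attached to the PR
    reviewers: tuple = ()           # reviewers attached to the PR
    approval_required_paths: tuple = ()  # changes here pause for a human

    def repo_allowed(self, repo: str) -> bool:
        return repo in self.eligible_repos

    def command_allowed(self, command: str) -> bool:
        return command in self.allowed_commands

    def out_of_scope(self, changed_files: list) -> list:
        """Files the agent touched that no allowed pattern covers."""
        return [
            f for f in changed_files
            if not any(fnmatch(f, p) for p in self.allowed_paths)
        ]

    def needs_approval(self, changed_files: list) -> bool:
        return any(
            fnmatch(f, p)
            for f in changed_files
            for p in self.approval_required_paths
        )

    def should_stop(self, changed_files: list) -> bool:
        """Stop instead of improvising once the diff leaves the template's scope."""
        return bool(self.out_of_scope(changed_files))
```

The useful property is that "the agent wandered into `src/`" becomes a checkable fact rather than something a reviewer notices on page three of the diff.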

That sounds heavy until you compare it with the alternative: every team inventing agent prompts inside scripts with no consistent review model.

programmatic agents need queue discipline

Once you can start agent tasks from automation, you need to decide how many should run.

This is not only a cost question. It is also a human review question.

If a migration script opens 80 agent-written PRs in one afternoon, did the team become more productive, or did it just move the bottleneck into review?

The answer depends on task quality and reviewer capacity. A mechanical dependency bump with strong tests might be perfect for fan-out. A subtle framework migration across core services probably should not arrive as a surprise stack of generated PRs before lunch.

Automation APIs make it very easy to confuse throughput with progress.

The queue needs to understand the downstream system. How many agent PRs can this team review today? Which repos have owners available? Which changes are blocked by release freezes?

That is why I would not wire an agent task API straight into a button called "fix everything."

I would put it behind a queue with explicit task, repository, reviewer, retry, and blast-radius budgets. The point is not to make agents slow. The point is to make agent work land in a shape humans can actually absorb.
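A rough sketch of that queue, assuming three of the budgets: a daily PR budget per reviewer team, a set of frozen repositories, and a cap on tasks in flight. `BudgetedQueue` and `QueuedTask` are hypothetical names, and the numbers in any real setup would come from the teams themselves.

```python
from collections import defaultdict, deque
from dataclasses import dataclass


@dataclass(frozen=True)
class QueuedTask:
    task_id: str
    repo: str
    reviewer_team: str


class BudgetedQueue:
    """Admit agent tasks only when downstream humans can absorb them."""

    def __init__(self, reviewer_daily_budget, frozen_repos=(), max_in_flight=5):
        self.reviewer_daily_budget = dict(reviewer_daily_budget)
        self.frozen_repos = set(frozen_repos)
        self.max_in_flight = max_in_flight
        self._pending = deque()
        self._admitted_today = defaultdict(int)
        self._in_flight = 0

    def submit(self, task: QueuedTask) -> None:
        self._pending.append(task)

    def admit_next(self):
        """Return the first pending task that fits every budget, or None."""
        if self._in_flight >= self.max_in_flight:
            return None
        for task in list(self._pending):
            if task.repo in self.frozen_repos:
                continue  # blocked by a release freeze, stays queued
            budget = self.reviewer_daily_budget.get(task.reviewer_team, 0)
            if self._admitted_today[task.reviewer_team] >= budget:
                continue  # this team has enough agent PRs for today
            self._pending.remove(task)
            self._admitted_today[task.reviewer_team] += 1
            self._in_flight += 1
            return task
        return None

    def complete(self, task: QueuedTask) -> None:
        """Free an in-flight slot; the daily reviewer count is not refunded."""
        self._in_flight = max(0, self._in_flight - 1)

    def pending_count(self) -> int:
        return len(self._pending)
```

Skipped tasks keep their place in line, so a frozen repo does not starve the work behind it, and a team with no stated budget gets zero agent PRs by default.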

identity is not a footnote

The GitHub preview supports personal access tokens and OAuth tokens today, with GitHub App installation access tokens coming later.

That token detail looks small, but platform teams should care.

If an internal portal starts an agent task, whose authority is it using? The developer who clicked the button? The portal service account? A GitHub App installed on approved repos?

The answer changes the audit story.

When the agent opens a PR, modifies files, runs checks, or comments on a failure, the organization needs to know which human request, which automation workflow, and which policy allowed that work to happen.

"Paulo clicked a button" is not enough.

"The service-template workflow started task 8f7c for repo X, using automation identity Y, under policy Z, from request ABC" is the kind of boring sentence that keeps security people from getting nervous.
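That boring sentence is really a record with required fields. A minimal sketch, assuming the platform writes one structured log line per started task and refuses to start a task when any attribution field is empty. `TaskAttribution` and its field names are hypothetical, and the example values in the usage note are placeholders.

```python
import json
from dataclasses import asdict, dataclass, fields


@dataclass(frozen=True)
class TaskAttribution:
    """Who and what allowed an agent task to run (hypothetical shape)."""

    workflow: str
    task_id: str
    repo: str
    automation_identity: str
    policy: str
    request_id: str
    requested_by: str

    def __post_init__(self) -> None:
        # Refuse to build a record with holes in the audit story.
        missing = [f.name for f in fields(self) if not getattr(self, f.name)]
        if missing:
            raise ValueError(f"missing attribution fields: {', '.join(missing)}")

    def to_log_line(self) -> str:
        """One JSON object per line, with stable key order for diffing and search."""
        return json.dumps(asdict(self), sort_keys=True)
```

Building `TaskAttribution(workflow="service-template", task_id="8f7c", repo="org/repo-x", automation_identity="portal-bot", policy="policy-z", request_id="ABC", requested_by="paulo")` gives a line an auditor can search for, and leaving `requested_by` blank fails before any agent starts.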

This is also why I like the broader industry movement toward sandboxing, network policies, approvals, and agent-native telemetry. Coding agents are acting inside development systems. The control plane has to know the difference between a human typing a command and an agent running a workflow on that human's behalf.

the pull request is not the whole artifact

A PR is a nice output. It is not the whole record.

For programmatic agent tasks, I want more than the diff:

  • original task input
  • triggering system
  • identity used
  • repositories and files in scope
  • commands run
  • tests attempted
  • failures and retries
  • external tools contacted
  • final confidence and known gaps

Some of that can live in the PR description. Some should live in logs. Some belongs in whatever internal system started the task.
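One way to picture that split is a single run record with a short rendering for the PR description, while the full record goes to logs or the triggering system. This is a sketch under my own assumptions: `RunRecord`, its fields, and the summary format are hypothetical.

```python
from dataclasses import asdict, dataclass, field


@dataclass
class RunRecord:
    """Everything a future reviewer needs beyond the diff (hypothetical shape)."""

    task_input: str
    triggered_by: str
    identity: str
    repos_in_scope: list
    commands_run: list = field(default_factory=list)
    tests_attempted: list = field(default_factory=list)
    failures: list = field(default_factory=list)
    external_tools: list = field(default_factory=list)
    known_gaps: list = field(default_factory=list)

    def pr_summary(self) -> str:
        """The short part that belongs in the PR description."""
        lines = [
            f"Task: {self.task_input}",
            f"Triggered by: {self.triggered_by} as {self.identity}",
            "Tests attempted: " + (", ".join(self.tests_attempted) or "none"),
            "Known gaps: " + ("; ".join(self.known_gaps) or "none recorded"),
        ]
        return "\n".join(lines)

    def full_record(self) -> dict:
        """The long part that belongs in logs or the internal task system."""
        return asdict(self)
```

The PR gets four lines a reviewer will actually read. The commands, retries, and external tool calls stay in the full record for the person doing the six-months-later investigation.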

The key is that future reviewers should not have to reconstruct the work from vibes. Six months later, someone will ask why a dependency was bumped in 47 repos, why three repos were skipped, and whether the agent followed the approved playbook.

If the answer is "check the old chat session," you do not have automation. You have archaeology with a friendly interface.

where this is actually useful

Programmatic agent tasks are best for work that is repetitive, bounded, testable, and annoying: dependency upgrades, codemods, repo bootstrapping, configuration cleanup, release preparation, and small framework migrations with a clear recipe.

That is real work, and agents can help when the review path is honest. The trap is pretending the same mechanism should handle every kind of engineering work. Some changes need deep system judgment. Some need product context. Some need an engineer to notice that the "obvious" fix violates a weird customer contract from 2021.

The API does not remove that. It gives us a better way to route the work.

what i would build first

If I were adding this to an internal developer platform, I would start with one narrow workflow the organization already understands, like "create a new service from the approved template" or "upgrade this library across repos that pass the compatibility check."

Then make the platform own the boring details:

  • a small set of approved task templates
  • repo eligibility rules
  • scoped automation identity
  • clear PR labels
  • required status checks
  • a task record that links the request, logs, PR, and outcome
  • retry rules that stop after a useful failure, not after burning a day

Do that before building the grand agent portal. The value is not the button. The value is the controlled path from request to reviewed change.
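The retry rule from that list is worth a small sketch, because "stop after useful failure" is a classification problem. I am assuming the platform can label each failure with a kind. The labels in `TRANSIENT_FAILURES` and the `RetryPolicy` name are hypothetical, not something any agent API reports.

```python
from dataclasses import dataclass

# Hypothetical failure labels: problems with the environment, not the task.
TRANSIENT_FAILURES = frozenset({"timeout", "rate_limited", "runner_unavailable"})


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int = 3

    def should_retry(self, attempt: int, failure_kind: str) -> bool:
        """Decide whether to retry after `attempt` (1-based) failed with `failure_kind`."""
        if attempt >= self.max_attempts:
            return False  # budget spent, hand the task to a human
        if failure_kind not in TRANSIENT_FAILURES:
            # A failing test or an out-of-scope diff is information.
            # Retrying would only burn time and hide the signal.
            return False
        return True
```

With the default policy, a first-attempt timeout is retried, a first-attempt test failure is surfaced immediately, and nothing gets a fourth attempt.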

the punchline

The Copilot cloud agent REST API is a small preview feature with a big implication: coding agents are becoming callable infrastructure. That is the right direction. The best agent workflows will not live only in chat windows. They will sit behind internal portals, migration tools, release systems, and platform workflows.

But once agents become automation, they inherit automation's responsibilities.

Queue the work. Scope the task. Limit the identity. Track the run. Preserve the evidence. Respect reviewer capacity. Stop when the task gets weird.

The future is not "everyone chats with an agent harder." The future is boring systems starting bounded agent tasks and treating the result like production engineering work.

Which is exactly how it should be.

Magic is nice in demos. APIs are where the responsibility shows up.
