慣性聚合 高效追讀感興趣之博客、新聞、科技資訊
閱原文 以慣性聚合開啟

推薦訂閱源

博客园 - 司徒正美
V
V2EX
T
Tailwind CSS Blog
有赞技术团队
有赞技术团队
aimingoo的专栏
aimingoo的专栏
Apple Machine Learning Research
Apple Machine Learning Research
IT之家
IT之家
Blog — PlanetScale
Blog — PlanetScale
A
About on SuperTechFans
月光博客
月光博客
T
The Blog of Author Tim Ferriss
宝玉的分享
宝玉的分享
Martin Fowler
Martin Fowler
博客园 - 聂微东
The GitHub Blog
The GitHub Blog
V
Visual Studio Blog
WordPress大学
WordPress大学
酷 壳 – CoolShell
酷 壳 – CoolShell
Engineering at Meta
Engineering at Meta
GbyAI
GbyAI

DEV Community

Authentication Security Deep Dive: From Brute Force to Salted Hashing (With Java Examples) Why AI Systems Don’t Fail — They Drift Spilling beans for how i learn for exam😁"Reinforcement Learning Cheat Sheet" I Replaced Chrome with Safari for AI Browser Automation. Here's What Broke (and What Finally Worked) How Python Borrows Other People's Work The $40 Architecture: Processing 1 Billion API Requests with 99.99% Uptime Vibe Coding: A Workflow Guide (From Zero to SaaS) Most webhook security guides protect the wrong side. The scary part is delivery. Headless CMS for TanStack Start: Build a Blog with Cosmic EU Age Verification App "Hacked in 2 Minutes" — What Actually Happened Comfy Cloud’s delete function does not actually remove files Running AI Models on GPU Cloud Servers: A Beginner Guide Event-driven media intelligence with AWS Step Functions and Bedrock I scored 500 AI prompts across 8 quality dimensions — here's what broke How to Call Google Gemini API from Next.js (Free Tier, No Backend Needed) The Portal Protocol: Reclaiming Human Connection in the Age of AI How to Fix Your Team's Scattered Knowledge Problem With a Self-Hosted Forum Intro to tc Cloud Functors: A Graph-First Mental Model for the Modern Cloud Designing Multi-Tenant Backends With Both Ownership and Team Access I Built a Neumorphic CSS Library with 77+ Components — Here's What I Learned PostgreSQL Performance Optimization: Why Connection Pooling Is Critical at Scale Cómo construí un SaaS multi-rubro para gestionar expensas en Argentina con FastAPI + Vue 3 🚀 I Built an Ethical Hacking Scanner Tool – Open Source Project I Replaced /usage and /context in Claude Code With a Single Statusline A Pythonic Way to Handle Emails (IMAP/SMTP) with Auto-Discovery and AI-Ready Design I Collected 8.9 Million Polymarket Price Points — Here's What I Found About How Markets Really Move EcoTrack AI — Carbon Footprint Tracker & Dashboard Everyone's Using AI. No One Agrees How. 5 self-hosted ebook managers worth trying in 2026 Building Your First AI Agent with LangChain: From Chatbot to Autonomous Assistant Common SOC 2 Failures (Real World) Stop Vibe-Checking Your AI App: A Practical Guide to Evals How to Use SonarQube and SonarScanner Locally to Level Up Your Code Quality Your Next To-Do App Is Dead — I Replaced Mine with an OpenClaw AI Sign a Nostr event in 60 lines of Python using coincurve — no nostr-sdk, no nbxplorer, no rust toolchain ITGC Audit Explained Like You’re in Big 4 Patch Tuesday abril 2026: Microsoft parcha 163 vulnerabilidades y un zero-day en SharePoint Stop scraping everything: a better way to track competitor price changes Listing on MCPize + the Official MCP Registry while routing payments OUTSIDE the marketplace — how I kept 100% of my x402 revenue Building an AI-Powered Risk Intelligence System Using Serverless Architecture Why We Ripped Function Overloading Out of Our AI Toolchain Testing AI-Generated Code: How to Actually Know If It Works SaaS Churn Is Killing Your Business. Here Is What to Do About It (Without a Support Team) The Speed of AI Is No Longer Linear - And Self-Improving Models Are Why How to Implement RBAC for MCP Tools: A Practical Guide for Engineering Teams From Standard Quote to Persuasive Proposal: AI Automation for Arborists I built a CLI that scaffolds complete multi-tenant SaaS apps Axios CVE-2025–62718: The Silent SSRF Bug That Could Be Hiding in Your Node.js App Right Now The dashboard that ended our friendship Data Pipelines Explained Simply (and How to Build Them with Python)
复选框剧场:吾如何止信AI代理人以行核查
John Rojas · 2026-05-24 · via DEV Community

为文之由:于前文中,吾辈曾研五维之评文法,涵清晰、易读、雅致、周全、术准。此五维之法,今已成吾辈AI之助,于每度审阅之际,默运于幕后,人莫之察。众皆但睹其评文之果,未尝思其由。

后吾自检时,始见彼吏所标洁净处实有瑕疵。文风检视报零误。再读时,得违时态三处。完备性检视标为全备。而工单之需未应于异同之辨。尺寸犹在运作。彼辈亦有所遗,而吾乃寻其漏失。

此篇所述,乃吾于缺隙中所学,所筑以弥其缺,及于至理之奥,犹在研索。渐次分享,倘有裨益,幸甚.

设局

吾已筑诸关隘于维度之验。一预航之关 之故,使吏员必先为诸务立细目,明示各维度之施行之法,而后方可审阅。无"吾将检式"之空想;必言"吾将运 gh pr diff 而检禁词,察肯否之违,及被动之构。" COMPLETION 之关。需有五维之证可考,方得成文于案卷:或述所察,或曰"无碍,此乃所验"。

其感甚周。其观甚周。其阅于纸亦甚周。其周若纸之清单,亦防火如一。

失之模式

吾始觉诸评之中:

风格扫描无瑕。吾自检之,得违时态三处。

完备之检,标为全备。然票项未应。

属吏报来,言无遗物于盘,可证。"行意志/意愿之检,无所得。"何在?示吾。where无所见。扫描未尝生文。惟得句耳。

吾欲慎其名之。此乃遵令而行。其前序待办云"行愿/将扫描。"标待办为毕,遂合其令。然扫描是否实行、是否得果,于结构而言,非此环所及。其关隘在人情,非在机巧。须待者择而行之,吾亦信其言而以为毕。

吾观其信,以为证。信非证也。信乃论证之辞。其间有别,此别每于评审之际,耗吾心神。

萦绕于心者,乃"复检之戏"一词。其门既存,有目有制,乃至具形之阻隔义。然其无锋芒。彼生于指令,而指令者,愿也。

其变也

吾问于吏曰:可令此门械化乎?非谓吏当行检,乃谓吏不得进,必俟检成文牍于盘,系于今之PR状态也。

是句涵文全。据证胜于状,质基胜于自陈。由门询主体自验,转门验主体已自验,而“已自验”之度,非以言辞,乃以文存。

既此变议,则施为昭然。大抵然也。吾犹觅其边际。

施为:三变

一变:脚本书物,非标旗

今每机扫,皆书一JSON之文。非标旗。非归码。一文书,具击数、文件径、样本合、时戳,及所对PR HEAD之SHA。

{
  "pr_number": "NNNN",
  "pr_head_oid": "<headRefOid>",
  "run_at": "2026-05-12T08:59:14Z",
  "dimensions": {
    "style": {
      "status": "ran",
      "hits": 0,
      "source": "style-gate.sh (5 scans: will/would, passive, placeholders, superlatives, boolean)"
    },
    "readability": {
      "hits": 2,
      "scan": "sentence length > 25 words on added prose lines",
      "samples": ["docs/api/auth.md:42", "docs/api/auth.md:67"]
    }
  },
  "total_hits": 2,
  "status": "ran"
}

入全幅 出全幅

此SHA引脚实有劳作。若PR得新之提交,则此物之pr_head_oid不复与今之HEAD相合。此物今已陈旧,是谓扫描之果亦陈旧,是谓五息前之洁净,今已不可明证其洁净。此代理必重行之。

移二:一钩,可拦其目的地

此乃关乎至要之变。光标支持beforeShellExecution钩:乃一壳脚本,于代理发出任何壳命令之前运行。此钩读之,决其是否为PR写入之命。gh api .../pulls/<N>/commentsgh pr edit --bodygh pr comment若然,则核验门之器,而后决允否。

此机制,专于游标。其理非然。他类代理之器,亦有同级之钩;若尔器无之,则执行之点移于提交前之钩,或于持续集成之关,然移转之道一也:置验证于代理不能言过之处。

此验证,故设为钝。PR-<N>-tickets.json存乎?具status: "loaded"乎?或具"partial_blocked"乎?其pr_head_oid 是否与 gh pr view 之 HEAD 相合?PR-<N>-gate.json 亦然。若任一核查不谐,则钩子返拒,并示明其意:

pr-review-gate-hook BLOCKED gh api .../pulls/NNNN/comments on PR #NNNN.

Missing or stale gate artifacts:
- Stale: PR-NNNN-gate.json pr_head_oid=<old-sha> but current PR HEAD is <new-sha>
  Re-run: ~/Documents/docs-agent/scripts/review-gate.sh NNNN

Resolve and retry. Bypass available for one command via environment variable.

Enter fullscreen mode Exit fullscreen mode

此钩子启用之时,何事更迭:代理不复能诳言扫描是否施行。实无物可诳。或器物存在而SHA相符,或GitHub之唤遭阻。代理犹能发一言曰"吾已行扫描"。此言不复生影响。钩子不读言辞,唯读文书。

今观前后之异:

一微妙而重要之细:若钩子本身出错,若jq未安装,若gh 不得达网络,命得以警而通。门失其闭。此乃有意为之。虚报之害,陈物漏过,其费甚微,盖次审可察之。虚报之害,每命崩毁,钩顿而失,其费甚巨。旁通之变数存焉,亦为此故:当尔诚需易之,尔可易之,然必有意为之。

移三:合乎维度之规

硬性门禁唯其门所指为正,故维度内之规亦趋严焉。风格维度不可复以引风格指南而足,必运行机扫,或以行内建议解每击,或以指令录零配之。完备维度则需依每项要求建映射表,自tickets.json,非独源于PR之差异,盖因所审之器物,其特征映射为环也。规则之结构,已非理想之愿景,而变为可行之操作。 "运行检查" 转为 "产出证明已运行检查之器物,并附其模式。"

反馈之环:教训为基建也

固守门闩,可御我所知之患。然未遇之变,非其所能御也。此等异状,别有隙录以应之。

隙录者,唯增无删之牍也。每阅一事毕,必加阅后增善之关。 运行:取每评者之评,问其流程可否于提交前察之,若可,则草拟一检以应下次。此检得录之。

格式,每阙一行:

2026-05-10 | PR-1234 | style    | passive voice not caught on added definition lines | open
2026-05-12 | PR-1242 | complete | nav entry missing for new partial                  | mechanized
2026-04-22 | PR-1289 | clarity  | "this powerful feature" not caught                 | resolved

入全屏模式 出全屏模式

三状为之:

open 已登入,然未为任一脚本或钩子所及。次次飞行前之检阅,将读此,注入隙为又一维度之验。
mechanized 今扫描或钩子自能捕此模式。隙可沉寂;基构自当其事。
resolved 其下循环之模式已亡(多因上游更易)。无复验之需。

此法于结构,乃将一时之学化为根基。PR-1234中显隙,非止于吾或忘之Slack消息,实存于日志。次之PRE-FLIGHT读日志,而警诸吏。待吾暇为机械捕此模式之脚本,则状乃易。此训不系于吾之记之。

诚然告白:此段犹存龃龉。每会终,吾必促助者遍览评者之议,录其阙遗。其式未臻自足。录事既成,注焉有效。"当记察评语"之步,犹为己任。他日有文以弥此隙,吾犹未得其形.

其理

信而核之,非人工智能之宜也。盖核者,即所求于大语言模型者也。行核之主体,亦为报核之主体,无第二者监之。其事之全,恃被察之实体自陈,非核验之模。此乃愿也,非实也。

此非善策,非良言也。善策乃使验证超乎代理之控制环之外。代理可书任何言辞,论其是否运行扫描。钩子不可书任何言辞,论文件存否。文件在盘则存,不在则亡。SHA码吻合则合,不吻合则不。代理纵有巧言,亦不能易此理也。

吾今所处:已知之失,设硬门以阻;未知之隙,立日志以记;元检之务,待手启而未可自动化。此乃可行之制,吾正积极精进之。

若尔营构人工智能辅佐之工,尤以大语言模型既为劳作亦为稽核者为甚,吾当促尔问吾所问之疑:吾之智能体,实于磁盘所产者何物,使吾可验之而不必信其言?若答曰无物,是徒具形骸之虚饰也。弥合此隙之始,在于令其有所产出。

后篇详述具体机械扫描器。次篇则论及间隙日志与耐久性问题。若君已解此惑,或见更锐之见,愿闻其详。


吾正发此,而系统犹在积极改进中,盖原理先成于我,而后于实施,而实施未竟。若尔建AI之程,使代理既为工亦为审,吾欲闻尔思此隙之状,或尔已得合隙之法。

吾撰文论人工智能助之文牍之务、开发者之体验、及技术文书之角色变迁。若此言有契,愿于LinkedIn相接。