





















It is a separation-of-duties argument for code written by AI agents and also reviewed by the same AI agents.
If you ask Agent1 to write code, the PR diff coming out of it should really be reviewed by Agent2 (or 3 or whatever), the reviewer should be a different system than the code gen agent.
The core claim is that a model is a poor judge of its own work — it shares the same priors and blind spots that produced a bug, so self-review trends toward a shallow review rather than catching blind spots.
The system that writes the code shouldn't also be the thing that signs off that it's safe to ship.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。