
























0xkvyb 10 minutes ago
I’m experimenting with a small self-harness repo based on this paper. The idea is to run simulated users through an agent harness, collect the traces, group the recurring failures, and use that to propose small harness changes with regression checks. Still early, but I’d be interested if anyone else is thinking about this workflow for agent development.
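A rough sketch of the "group recurring failures, then regression-check a change" part of that loop, for illustration only. Everything here is hypothetical and not taken from the repo or the paper: the `Trace` schema, the failure labels, and the rule that a change passes only if no failure label becomes more frequent are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class Trace:
    """One simulated-user run through the harness (hypothetical schema)."""
    user_id: str
    failure: Optional[str]  # short failure label, or None if the run succeeded


def group_failures(traces):
    """Count how often each failure label occurs across traces."""
    return Counter(t.failure for t in traces if t.failure is not None)


def recurring(traces, min_count=2):
    """Return failure labels seen at least min_count times, most frequent first."""
    counts = group_failures(traces)
    return [label for label, n in counts.most_common() if n >= min_count]


def passes_regression(before, after):
    """Assumed rule: a change passes if no failure label becomes more frequent."""
    old, new = group_failures(before), group_failures(after)
    # Counter returns 0 for labels it has not seen, so new labels fail the check.
    return all(new[label] <= old[label] for label in new)


# Traces from the unmodified harness (made-up data).
baseline = [
    Trace("u1", "tool_timeout"),
    Trace("u2", "tool_timeout"),
    Trace("u3", "bad_json"),
    Trace("u4", None),
]
# Traces from the same simulated users after a small harness change (made-up data).
candidate = [
    Trace("u1", None),
    Trace("u2", "tool_timeout"),
    Trace("u3", "bad_json"),
    Trace("u4", None),
]

print(recurring(baseline))                     # labels worth proposing a fix for
print(passes_regression(baseline, candidate))  # did the change avoid regressions?
```

In practice the grouping step would probably need something fuzzier than exact label matching (clustering on trace text, for example), but the before/after comparison would keep the same shape.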