计算器之用者：三沙盒之式，不泄凭信

書： AI 代理口袋指南：用大語言模型構建自主系統之模式
亦余所譯： 思乎乎於Go（二書系列）— Go編程完備指南 + Go中六邊形架構
吾之項目： 赫爾墨斯IDE |GitHub — 一款供开发者使用之IDE，可配合Claude Code及其他人工智能编程工具使用
吾言： xgabriel.com | GitHub

用机之式已发。其使可点按钮，书密码，览密码簿，并倾瀹浏览器中每枚之饼干。三沙盒之式，可束爆域之广，而不损其使实善之事。

吾友曩月以Anthropic之用戶API織入收賬自動化之器。首度試運行，登入Stripe之測試儀板，自URL欄抄取一API鑰，貼入於一聊天完成之處。鑰終至於一追蹤輸出，與賣家共享之。無人欲泄之。此代理僅有瀏覽器與憑證於同一地址空間，而模型遂作所請之事。

此非智巧之模，乃视之如凡庸之程：束之，通其密，遏其不可逆耳。

为何"但予之浏览器"非正道

用机之模得见屏影。此乃真界面也。模所睹之每一像素，皆可为工具之唤、思理之迹、或长时记忆之条。

观其表面积：

瀏覽器自動填寫。Chrome之密码自动填充，于聚焦时将密码呈于DOM中一帧之顷。截图得之。
剪贴板。 Cmd+V显其用户所复之文，诚然，亦含其三刻前所贴之恢复语。
系统之通知。 令旗载两重密钥，日程邀约附会链，Slack密信往来。
缓存会话。 每所之站，皆载吾之档案所携之饼。以代理之浏览器登录Gmail，则代理亦已登录Gmail。
文件系统之对谈。 Cmd+O 显家目录。SSH之钥，.env 之件，~/.aws/credentials。

特工不必怀恶意。误一次足矣。网页提示“继续前，请用户确认其API密钥”即可致乱。

三层模式，各堵一漏。

模式一—会话一器，瞬息即逝

每代理会话皆有其一时之器。此器无恒存之态，无凭信内嵌，会话既终，则器亦消亡。

此乃承重之范式。余二者乃叠于其上。

此乃Docker之布置。察其所缺：无卷挂载，无主网络，无--privileged，无Docker之套接。

# docker-compose.agent-session.yml
services:
  agent-browser:
    image: agent-sandbox:1.4.2
    init: true
    read_only: true
    tmpfs:
      - /tmp:size=512m,mode=1777
      - /home/agent/.cache:size=256m,mode=0700
    security_opt:
      - no-new-privileges:true
      - seccomp=./seccomp-strict.json
    cap_drop:
      - ALL
    cap_add:
      - CHOWN
      - SETUID
      - SETGID
    user: "1001:1001"
    networks:
      - agent-egress
    mem_limit: 2g
    pids_limit: 256
    environment:
      DISPLAY: ":99"
      SESSION_TTL_SECONDS: "1800"
    # no volumes. ever. the container dies, state goes with it.

此图像自带硬化Chromium构建，Xvfb供虚拟显示之用，并有纤薄HTTP服务器，以示截图及操作端点于控制器。无SSH。运行时无包管理器（万物皆存于只读层）。无外出DNS，唯经出口代理（详见模式三）。

控制器侧之生命周期，以Python言之：

import asyncio
import uuid
from contextlib import asynccontextmanager

import docker

_client = docker.from_env()


@asynccontextmanager
async def agent_session(*, ttl_seconds: int = 1800):
    """One container per agent session. Auto-killed on exit or TTL."""
    session_id = str(uuid.uuid4())
    container = _client.containers.run(
        image="agent-sandbox:1.4.2",
        name=f"agent-{session_id}",
        detach=True,
        remove=True,
        # network attached, but everything goes through the egress proxy
        network="agent-egress",
        environment={"SESSION_ID": session_id},
        # the rest is in docker-compose; this is the runtime override layer
    )
    killer = asyncio.create_task(_kill_after(container, ttl_seconds))
    try:
        await _wait_for_ready(container)
        yield SessionHandle(container, session_id)
    finally:
        killer.cancel()
        try:
            container.kill()
        except docker.errors.NotFound:
            pass  # already gone, great


async def _kill_after(container, seconds: int) -> None:
    await asyncio.sleep(seconds)
    # hard kill, no graceful shutdown. the agent had its turn.
    try:
        container.kill()
    except docker.errors.NotFound:
        pass

时序逻辑之要，甚于众人之所期。若使代理运行九十分钟，则积聚境况，涉猎非其所当涉之域，且浏览器中或存陈旧之认证令牌。当限其会话。若事未竟，则明示传递相关状态，开启新会话。

了然：勿使容器跨代理运行以"节启动时"。冷启动，调适之Chromium图像需时1.2秒。越会状态污染之代价，远逾此数级.

模式二 — 凭信中介

代理永不见密码、API密钥或刷新令牌。其自沙盒外运行之中介服务，得受短暂、限域之令牌。

此乃闭截屏漏之式。若凭据永不出屏，则模型不能发之。

经纪者显一术。赐令符以行事X于物Y，效期N秒。代理人呼之。经纪者决其铸券与否，授其域，定其寿。

# broker/api.py — runs OUTSIDE the agent sandbox, talks to it over a unix socket
import secrets
import time
from dataclasses import dataclass
from typing import Literal

Resource = Literal["stripe", "gmail", "github", "calendar"]
Action = Literal["read", "write", "delete"]


@dataclass(frozen=True)
class TokenGrant:
    token: str
    expires_at: float
    scope: tuple[Resource, Action]


class CredentialBroker:
    def __init__(self, vault, policy, audit_log):
        self._vault = vault
        self._policy = policy
        self._audit = audit_log
        self._active: dict[str, TokenGrant] = {}

    def grant(
        self,
        *,
        session_id: str,
        resource: Resource,
        action: Action,
        ttl_seconds: int = 60,
    ) -> TokenGrant | None:
        # policy check: is this session allowed to ask for this scope?
        if not self._policy.allows(session_id, resource, action):
            self._audit.deny(session_id, resource, action, "policy")
            return None

        # the actual secret lives in the vault. the broker mints a proxy token.
        underlying = self._vault.fetch(resource, action)
        if underlying is None:
            self._audit.deny(session_id, resource, action, "no-credential")
            return None

        token = f"agent-{secrets.token_urlsafe(32)}"
        grant = TokenGrant(
            token=token,
            expires_at=time.time() + ttl_seconds,
            scope=(resource, action),
        )
        self._active[token] = grant
        self._audit.grant(session_id, resource, action, token, ttl_seconds)
        return grant

    def resolve(self, token: str) -> tuple[Resource, Action, str] | None:
        """Called by the egress proxy to swap a token for the real header."""
        grant = self._active.get(token)
        if grant is None or grant.expires_at < time.time():
            self._active.pop(token, None)
            return None
        resource, action = grant.scope
        real = self._vault.fetch(resource, action)
        return resource, action, real

代理之器得令牌，未得密钥：

async def request_stripe_token(session: SessionHandle, action: str) -> str | None:
    """Tool exposed to the agent. Returns a stub token, never the real key."""
    grant = await session.broker.grant(
        session_id=session.id,
        resource="stripe",
        action=action,
        ttl_seconds=120,
    )
    return grant.token if grant else None

出逸之介（沙盒唯一能通互联网之物）见外向之 stub 令牌Authorization之表，检于市，易以真Stripe密钥而出，返则去之。使者未尝持其秘。截屏无虞。踪迹导出无虞。内存条目无虞。

有两事似可无，实不可缺：

令牌之TTL。六十至三百秒。若使者陷于循环而焚令牌，必再请之，此乃尔拒之之第二机也。
会话之策.只读会话，永无得写令。当将策铸于中涓，非铸于诘问。诘问者，软篱也.

善哉：勿于策拒时返丰谬。{"error": "denied: stripe.write requires admin policy"}示诘问所引之使，知下次所求。返{"error": "credential_unavailable"}而录其真由。

三式——毁性之行为之准

有行为可复。阅页可复：闭之则毕。寄函于客不可复。刷卡不可复。删行不可复。

准之制，截凡不可复之行为，必待人力（或更严之策）之认，而后行。

分类器居模型工具调用之输出与实际副作用之间：

from enum import Enum
from typing import Awaitable, Callable

class Risk(Enum):
    REVERSIBLE = "reversible"
    RECOVERABLE_WITH_AUDIT = "recoverable_with_audit"
    IRREVERSIBLE = "irreversible"

# the policy lives in code, not in the prompt
ACTION_RISK: dict[str, Risk] = {
    "browser.click": Risk.REVERSIBLE,
    "browser.type": Risk.REVERSIBLE,
    "browser.screenshot": Risk.REVERSIBLE,
    "browser.navigate": Risk.REVERSIBLE,
    "form.submit_purchase": Risk.IRREVERSIBLE,
    "email.send_to_customer": Risk.IRREVERSIBLE,
    "stripe.charge": Risk.IRREVERSIBLE,
    "stripe.refund": Risk.RECOVERABLE_WITH_AUDIT,
    "db.delete_row": Risk.IRREVERSIBLE,
    "git.force_push": Risk.IRREVERSIBLE,
}


async def execute_with_gate(
    action: str,
    payload: dict,
    *,
    executor: Callable[[str, dict], Awaitable[dict]],
    approver: "ApprovalChannel",
) -> dict:
    risk = ACTION_RISK.get(action, Risk.IRREVERSIBLE)  # unknown = strict

    if risk is Risk.REVERSIBLE:
        return await executor(action, payload)

    if risk is Risk.RECOVERABLE_WITH_AUDIT:
        # auto-approve but write a full audit trail with a "revert" link
        await approver.audit(action, payload)
        return await executor(action, payload)

    # IRREVERSIBLE: block until a human says yes (or times out)
    decision = await approver.request(
        action=action,
        payload=payload,
        timeout_seconds=300,
    )
    if decision.approved:
        return await executor(action, payload)
    return {"status": "denied", "reason": decision.reason}

吾approver界面或为 Slack 消息，或为接入管理界面之 Web hook，或为开发时之 CLI 提示。其要在于此。非正道之讯非代理自身所能答。以提示注入之代理，不能自许其购，盖许购之境非其行域也。

调二事：

固守不可逆之常态于未知之行。其线ACTION_RISK.get(action, Risk.IRREVERSIBLE)安知安行之别，与子夜之页乎？
设硬时限。若五分钟内无人核准，则事不果。否则，代理人可滞留候眠评者，而汝之工役队列遂壅。

了然：勿唯器之名而分类。db.execute若为可逆，则可SELECT且不可逆若其为DELETE器具泛化时，宜使负载经次分类器而通。

架构参考——容器+中介+门__

三者合构。代理行于暂存之器内；中介踞器外，代传所有外发凭信；门居模型谋算之环与诸器执行间__

+--------------------+
|  Orchestrator      |
|  (your service)    |
+----------+---------+
           |
           |  spawn session
           v
+----------+---------+      +-----------------+
|  Ephemeral         |      |  Credential     |
|  Container         |<---->|  Broker         |
|  (Chromium + tools)|      |  (vault facade) |
+----------+---------+      +--------+--------+
           |                         |
           |  HTTPS via egress proxy |
           v                         v
+--------------------+      +-----------------+
|  Egress Proxy      |<-----|  Audit Log      |
|  (token resolver)  |      |  (immutable)    |
+----------+---------+      +-----------------+
           |
           v
        Internet

总司者决其事之需阖，经纪者决其会之可索之凭，代行者决其器之可及之主。三权独立，若一权失，余二权犹存.

尚漏何物

诚言此法所不及：

模型于思辨中所显之屏文。若使代理人于纸上阅得顾客之名，而书曰"此用户似为约翰·史密斯也"约翰·例姆普劳尔@邮箱地址"在其规划中，那串信息现已在汝之轨迹导出。于轨迹处理器处删改之（样本处理已涵盖于内）。"《大语言模型可观测性之书》，另发一帖。
旁道时序。 必勇之攻者，或可由迟滞度推知凭据之长。此非多数应用之范畴；若输送于规管之业，则属其域。
模型之记忆。 若于运行间恒存会话之要，则会话N之漏，可复现于会话N+1。视记忆为迹之输出——存前必删。

出境之控——此乃众队所忘之部

汝之沙盒，其坚若网略之周遭。一容器能curl任意之主机，则可窃取。出站之代理，不可商榷。

最低之略：默认拒之，按会话允之，依域而定。

# Network policy — denied to anything except the proxy
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: agent-sandbox-egress
spec:
  podSelector:
    matchLabels:
      role: agent-sandbox
  policyTypes:
    - Egress
  egress:
    - to:
        - podSelector:
            matchLabels:
              role: egress-proxy
      ports:
        - protocol: TCP
          port: 8443

此代理自运一许列，以会元之元为钥。一会有责于“撮要Stripe之仪表”，则得api.stripe.com，得dashboard.stripe.com，除此之外，别无他物。不得pastebin.com，不得discord.com，亦不得汝之内部管理之面板.

亦阻绝代理外之DNS解析——否则，一巧慧之提示注入，可借DNS查询以窃数据，纵HTTP已严锁。

使用计算机之代理，非新患也，乃旧患而败速耳。视此代理如可信之实习生，授意而不授断。为其设净室，次第授钥，遇不可逆之事，则立于门侧以观。

汝所睹最甚之泄，或险而免者，自代理得真浏览器始，言于注。

若此有益

吾等AI Agents 袖珍指南深究代理之约束，兼及凭证中介之模式、核准之关隘决断之树，并章于最小权限之器物设计。若尔运送计算机使用之代理，则器物范围之极小化与流出政策之配对，直合于前述。

推薦訂閱源

DEV Community