
fix: dedupe replayed exec.finished node events (#67281) · openclaw/openclaw@5dcf526
2026-04-16 · via Recent Commits to openclaw:main

@@ -0,0 +1,122 @@

# Async Exec Duplicate Completion Investigation

## Scope

- Session: `agent:main:telegram:group:-1003774691294:topic:1`
- Symptom: the same async exec completion for session/run `keen-nexus` was recorded twice in LCM as user turns.
- Goal: identify whether this is most likely duplicate session injection or plain outbound delivery retry.

## Conclusion

Most likely this is **duplicate session injection**, not a pure outbound delivery retry.

The strongest gateway-side gap is in the **node exec completion path**:

1. A node-side exec finish emits `exec.finished` with the full `runId`.
2. Gateway `server-node-events` converts that into a system event and requests a heartbeat.
3. The heartbeat run injects the drained system event block into the agent prompt.
4. The embedded runner persists that prompt as a new user turn in the session transcript.

If the same `exec.finished` reaches the gateway twice for the same `runId` for any reason (replay, reconnect duplicate, upstream resend, duplicated producer), OpenClaw currently has **no idempotency check keyed by `runId`/`contextKey`** on this path. The second copy will become a second user message with the same content.

## Exact Code Path

### 1. Producer: node exec completion event

- `src/node-host/invoke.ts:340-360`
  - `sendExecFinishedEvent(...)` emits `node.event` with event `exec.finished`.
  - Payload includes `sessionKey` and full `runId`.

### 2. Gateway event ingestion

- `src/gateway/server-node-events.ts:574-640`
  - Handles `exec.finished`.
  - Builds text:
    - `Exec finished (node=..., id=<runId>, code ...)`
  - Enqueues it via:
    - `` enqueueSystemEvent(text, { sessionKey, contextKey: runId ? `exec:${runId}` : "exec", trusted: false }) ``
  - Immediately requests a wake:
    - `requestHeartbeatNow(scopedHeartbeatWakeOptions(sessionKey, { reason: "exec-event" }))`

### 3. System event dedupe weakness

- `src/infra/system-events.ts:90-115`
  - `enqueueSystemEvent(...)` only suppresses **consecutive duplicate text**:
    - `if (entry.lastText === cleaned) return false`
  - It stores `contextKey`, but does **not** use `contextKey` for idempotency.
  - After drain, duplicate suppression resets.

This means a replayed `exec.finished` with the same `runId` can be accepted again later, even though the code already had a stable idempotency candidate (`exec:<runId>`).
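
The weakness can be reproduced with a small model of the queue. This is a sketch under stated assumptions, not the code in `src/infra/system-events.ts`: the class, its method names, and the sample event text are hypothetical, and only the two behaviors described above (text-only consecutive suppression, reset on drain) are modeled.

```typescript
// Hypothetical, simplified model of the queue behavior described above.
// None of these names are taken from src/infra/system-events.ts.
type SketchQueuedEvent = { text: string; contextKey?: string };

class SketchSystemEventQueue {
  private events: SketchQueuedEvent[] = [];
  private lastText: string | undefined;

  // Returns false when the event is suppressed as a consecutive duplicate.
  enqueue(text: string, contextKey?: string): boolean {
    const cleaned = text.trim();
    if (this.lastText === cleaned) return false; // compares text only
    this.lastText = cleaned;
    this.events.push({ text: cleaned, contextKey }); // contextKey is stored, never consulted
    return true;
  }

  // Returns everything queued and resets duplicate suppression.
  drain(): SketchQueuedEvent[] {
    const drained = this.events;
    this.events = [];
    this.lastText = undefined;
    return drained;
  }
}

const sketchQueue = new SketchSystemEventQueue();
const sketchText = "Exec finished (id=run-1)"; // illustrative text only
const firstAccepted = sketchQueue.enqueue(sketchText, "exec:run-1");
const replayBeforeDrain = sketchQueue.enqueue(sketchText, "exec:run-1");
const firstDrain = sketchQueue.drain();
const replayAfterDrain = sketchQueue.enqueue(sketchText, "exec:run-1");
```

In this model the replay that arrives before the drain is suppressed, while the replay that arrives after the drain is accepted even though it carries the same `contextKey`.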

### 4. Wake handling is not the primary duplicator

- `src/infra/heartbeat-wake.ts:79-117`
  - Wakes are coalesced by `(agentId, sessionKey)`.
  - Duplicate wake requests for the same target collapse to one pending wake entry.

This makes **duplicate wake handling alone** a weaker explanation than duplicate event ingestion.
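
A minimal sketch of that coalescing, assuming a map keyed by the `(agentId, sessionKey)` pair. The class and field names are hypothetical, and which of two colliding requests is kept (here, the first) is an assumption rather than something the investigation states.

```typescript
// Hypothetical model of wake coalescing; not the code in heartbeat-wake.ts.
type SketchWakeRequest = { agentId: string; sessionKey: string; reason: string };

class SketchWakeCoalescer {
  private readonly pending = new Map<string, SketchWakeRequest>();

  // Registers a wake unless one is already pending for the same target.
  request(wake: SketchWakeRequest): void {
    const key = JSON.stringify([wake.agentId, wake.sessionKey]);
    if (!this.pending.has(key)) this.pending.set(key, wake); // assumption: first request wins
  }

  // Returns the pending wakes, one per target, and clears the table.
  flush(): SketchWakeRequest[] {
    const wakes = Array.from(this.pending.values());
    this.pending.clear();
    return wakes;
  }
}

const coalescer = new SketchWakeCoalescer();
coalescer.request({ agentId: "main", sessionKey: "session-a", reason: "exec-event" });
coalescer.request({ agentId: "main", sessionKey: "session-a", reason: "exec-event" });
coalescer.request({ agentId: "main", sessionKey: "session-b", reason: "exec-event" });
const flushedWakes = coalescer.flush();
```

Two requests for the same target produce a single wake here, so a duplicated wake request on its own would not yield two heartbeat runs.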

### 5. Heartbeat consumes the event and turns it into prompt input

- `src/infra/heartbeat-runner.ts:535-574`
  - Preflight peeks pending system events and classifies exec-event runs.
- `src/auto-reply/reply/session-system-events.ts:86-90`
  - `drainFormattedSystemEvents(...)` drains the queue for the session.
- `src/auto-reply/reply/get-reply-run.ts:400-427`
  - The drained system event block is prepended into the agent prompt body.

### 6. Transcript injection point

- `src/agents/pi-embedded-runner/run/attempt.ts:2000-2017`
  - `activeSession.prompt(effectivePrompt)` submits the full prompt to the embedded PI session.
  - That is the point where the completion-derived prompt becomes a persisted user turn.

So once the same system event is rebuilt into the prompt twice, duplicate LCM user messages are expected.

## Why plain outbound delivery retry is less likely

There is a real outbound failure path in the heartbeat runner:

- `src/infra/heartbeat-runner.ts:1194-1242`
  - The reply is generated first.
  - Outbound delivery happens later via `deliverOutboundPayloads(...)`.
  - Failure there returns `{ status: "failed" }`.

However, for the same system event queue entry, this alone is **not sufficient** to explain the duplicate user turns:

- `src/auto-reply/reply/session-system-events.ts:86-90`
  - The system event queue is already drained before outbound delivery.

So a channel send retry by itself would not recreate the exact same queued event. It could explain missing or failed external delivery, but it would not, on its own, produce a second identical user message in the session.
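
The ordering argument can be illustrated with a toy heartbeat turn. This is a sketch, not the code in `heartbeat-runner.ts`: the function, the `"sent"` status value, and the choice to skip the turn when the queue is empty are all assumptions made for illustration. Only the order (drain, persist the prompt, then deliver) comes from the analysis above.

```typescript
// Hypothetical model of one heartbeat turn; not the code in heartbeat-runner.ts.
type SketchDeliveryResult = { status: "sent" | "failed" };

function sketchHeartbeatTurn(
  queue: string[],
  transcript: string[],
  deliver: (reply: string) => SketchDeliveryResult,
): SketchDeliveryResult | undefined {
  // Drain first: after this line the queue no longer holds the event.
  const drained = queue.splice(0, queue.length);
  if (drained.length === 0) return undefined; // assumption: nothing to inject, no turn
  transcript.push(drained.join("\n")); // the prompt is persisted as a user turn
  const reply = `handled ${drained.length} event(s)`; // stand-in for the generated reply
  return deliver(reply); // delivery happens last and may fail
}

const pendingEvents = ["Exec finished (id=run-1)"]; // illustrative text only
const sessionTranscript: string[] = [];
let deliveryAttempts = 0;
const alwaysFail = (_reply: string): SketchDeliveryResult => {
  deliveryAttempts += 1;
  return { status: "failed" };
};

const firstTurn = sketchHeartbeatTurn(pendingEvents, sessionTranscript, alwaysFail);
// Retrying only the send touches neither the queue nor the transcript.
alwaysFail("retry of the same reply");
const secondTurn = sketchHeartbeatTurn(pendingEvents, sessionTranscript, alwaysFail);
```

After a failed delivery and a send retry, the transcript in this model still holds exactly one user turn. A second turn appears only if the event is enqueued again.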

## Secondary, lower-confidence possibility

There is a full-run retry loop in the agent runner:

- `src/auto-reply/reply/agent-runner-execution.ts:741-1473`
  - Certain transient failures can retry the whole run and resubmit the same `commandBody`.

That can duplicate a persisted user prompt **within the same reply execution** if the prompt was already appended before the retry condition triggered.
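
A toy version of that failure mode, assuming the prompt is appended to the transcript before each attempt and that every error counts as transient. The wrapper and its parameters are hypothetical and do not mirror `agent-runner-execution.ts`.

```typescript
// Hypothetical retry wrapper; names are not from agent-runner-execution.ts.
function sketchRunWithRetry(
  commandBody: string,
  transcript: string[],
  runModel: (attempt: number) => string,
  maxAttempts: number,
): string {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    transcript.push(commandBody); // prompt appended before the attempt can fail
    try {
      return runModel(attempt);
    } catch (error) {
      lastError = error; // assumption: every error is treated as transient
    }
  }
  throw lastError;
}

const retryTranscript: string[] = [];
const retryReply = sketchRunWithRetry(
  "Exec finished (id=run-1)", // illustrative prompt text only
  retryTranscript,
  (attempt) => {
    if (attempt === 1) throw new Error("transient failure");
    return "ok";
  },
  2,
);
```

Here a single failed attempt leaves two identical entries in the transcript, and both are written inside one call with no separate wake in between.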

I rank this lower than duplicate `exec.finished` ingestion because:

- the observed gap was around 51 seconds, which looks more like a second wake/turn than an in-process retry;
- the report already mentions repeated message send failures, which points more toward a separate later turn than an immediate model/runtime retry.

## Root Cause Hypothesis

Highest-confidence hypothesis:

- The `keen-nexus` completion came through the **node exec event path**.
- The same `exec.finished` was delivered to `server-node-events` twice.
- Gateway accepted both because `enqueueSystemEvent(...)` does not dedupe by `contextKey` / `runId`.
- Each accepted event triggered a heartbeat and was injected as a user turn into the PI transcript.

## Proposed Tiny Surgical Fix

If a fix is wanted, the smallest high-value change is:

- make exec/system-event idempotency honor `contextKey` for a short horizon, at least for exact `(sessionKey, contextKey, text)` repeats;
- or add a dedicated dedupe in `server-node-events` for `exec.finished` keyed by `(sessionKey, runId, event kind)`.

That would directly block replayed `exec.finished` duplicates before they become session turns.
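
One possible shape for the second option is sketched below. It is not the change made in this commit: the class name, the injected clock value, and the 60-second horizon are assumptions chosen so that a replay arriving about 51 seconds later, as in the observed case, would be rejected.

```typescript
// Hypothetical dedupe helper; not the implementation in server-node-events.ts.
class SketchExecFinishedDeduper {
  private readonly seen = new Map<string, number>(); // key -> expiry time in ms
  private readonly ttlMs: number;

  constructor(ttlMs: number) {
    this.ttlMs = ttlMs;
  }

  // Returns true when the event should be processed, false for a replay.
  shouldAccept(sessionKey: string, runId: string, kind: string, nowMs: number): boolean {
    // Drop expired entries so the table stays bounded by recent traffic.
    this.seen.forEach((expiresAt, key) => {
      if (expiresAt <= nowMs) this.seen.delete(key);
    });
    const key = JSON.stringify([sessionKey, runId, kind]);
    if (this.seen.has(key)) return false;
    this.seen.set(key, nowMs + this.ttlMs);
    return true;
  }
}

const deduper = new SketchExecFinishedDeduper(60000); // assumed 60 s horizon
const acceptedFirst = deduper.shouldAccept("session-a", "run-1", "exec.finished", 0);
const acceptedReplay = deduper.shouldAccept("session-a", "run-1", "exec.finished", 51000);
const acceptedOtherRun = deduper.shouldAccept("session-a", "run-2", "exec.finished", 51000);
const acceptedAfterTtl = deduper.shouldAccept("session-a", "run-1", "exec.finished", 61000);
```

The sketch requires a `runId`. Events that fall back to the bare `"exec"` context key have no stable identity and would need separate handling.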