
GitHub - Diplomat-ai/diplomat-agent-ts: What can your TypeScript AI agent do to the real world? Scan your code. See which tool calls have zero checks
jguarnelli · 2026-05-28 · via Hacker News: Show HN


You shipped a TypeScript AI agent. Do you know every function it can call that writes to a database, sends an email, charges a card, or deletes data — and which ones have zero checks?

diplomat-agent-ts runs a static AST scan and tells you exactly that. Two dependencies. ~9 s on a 7,874-file TypeScript agent codebase (OpenClaw, M-series). ~30 s on slower x86 hardware without a tsconfig.json.

npm install -D @diplomat-ai/diplomat-agent-ts
npx diplomat-agent-ts scan .        # scan from project root
npx diplomat-agent-ts scan ./src    # or a specific subdirectory

Before diplomat-agent-ts: you can't see what your agent can do. After: every tool call is mapped, classified by side effect, and tagged with OWASP Agentic codes — diffable in PRs.

What it looks like

[Figure: diplomat-agent-ts terminal output showing one unguarded chargeCustomer call with missing bounds, rate limit, and approval step (mapped to OWASP ASI-01, ASI-02, ASI-03, ASI-06), one unguarded deleteUserData call, and one confirmed sendWelcomeEmail call annotated with checked:ok.]

Why this matters for AI agents

In a web app, a human clicks a button. The UI has validation, confirmation dialogs, rate limits per session.

In an agent, an LLM decides which functions to call, with what arguments, how many times. It doesn't know your business rules. It can loop, hallucinate arguments, or get prompt-injected.

Without guards in the code, there's nothing between the LLM's decision and the real-world consequence.

We scanned the OpenClaw agent codebase at pinned commit 49d9996d (7,874 TypeScript files, ~9 s on M-series, ~30 s on x86). 419 tool calls had real side effects. 332 of them (79%) had zero checks. Not a single one was confirmed.

What it detects

40+ patterns across 12 side-effect categories:

| Category | Examples (TS-native) | Required guards |
|---|---|---|
| payment | stripe.charges.create(), stripe.refunds.create() | bounds, rate limit, approval |
| database_write | Prisma .create() / .update(), Mongoose .save() | input validation, rate limit |
| database_delete | Prisma .delete(), Mongoose .deleteOne(), raw DELETE | batch protection, confirmation |
| http_write | axios.post(), fetch(POST), got.put() | rate limit, retry bound |
| email | nodemailer.sendMail(), resend.emails.send(), sgMail.send() | rate limit |
| messaging | twilio.messages.create(), slack.chat.postMessage() | rate limit |
| agent_invocation | agent.run(), graph.invoke(), Runner.run() | input validation, approval |
| llm_call | openai.chat.completions.create(), anthropic.messages.create() | |
| publish | s3.send(PutObjectCommand), client.publish() | approval |
| dynamic_code | eval(), new Function(), vm.runInNewContext() | confirmation |
| file_delete | fs.rm(), fs.unlink(), fs-extra.remove() | confirmation |
| destructive | execSync(), spawnSync(), execa() | confirmation |

What counts as a guard: input validation (Zod, Yup, class-validator), rate limiting (NestJS @Throttle, custom decorators), auth checks (NestJS guards, middleware), confirmation steps, idempotency keys, retry bounds. Full catalog in src/scanner/patterns.ts.
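To make the guard types above concrete, here is a minimal sketch of a payment call wrapped in bounds, a rate limit, and an idempotency key. Everything in it is hypothetical: `createGuardedCharge`, the constants, and the injected `chargeFn` (a stand-in for a real payment client) are illustration only, and the sketch makes no claim about which code shapes the scanner actually recognizes as guards.

```typescript
// Hypothetical sketch: three guards in front of a payment side effect.
// `chargeFn` stands in for a real payment client; nothing here is part of diplomat-agent-ts.

type ChargeRequest = { amountCents: number; customerId: string; idempotencyKey: string };
type ChargeFn = (req: ChargeRequest) => string;

const MAX_AMOUNT_CENTS = 50_000; // assumed business bound
const MAX_CALLS_PER_WINDOW = 3;  // assumed per-customer rate limit
const WINDOW_MS = 60_000;

function createGuardedCharge(chargeFn: ChargeFn, now: () => number = Date.now) {
  const callTimes = new Map<string, number[]>(); // customerId -> timestamps of recent charges
  const resultsByKey = new Map<string, string>(); // idempotencyKey -> earlier result

  return function guardedCharge(req: ChargeRequest): string {
    // Guard 1: bounds on the amount the LLM asked for.
    if (!Number.isInteger(req.amountCents) || req.amountCents <= 0 || req.amountCents > MAX_AMOUNT_CENTS) {
      throw new Error(`amount out of bounds: ${req.amountCents}`);
    }

    // Guard 2: idempotency. A replayed key returns the earlier result without charging again.
    const previous = resultsByKey.get(req.idempotencyKey);
    if (previous !== undefined) return previous;

    // Guard 3: sliding-window rate limit per customer.
    const t = now();
    const recent = (callTimes.get(req.customerId) ?? []).filter((ts) => t - ts < WINDOW_MS);
    if (recent.length >= MAX_CALLS_PER_WINDOW) throw new Error("rate limit exceeded");
    recent.push(t);
    callTimes.set(req.customerId, recent);

    const result = chargeFn(req);
    resultsByKey.set(req.idempotencyKey, result);
    return result;
  };
}
```

Injecting `chargeFn` and `now` keeps the guard logic testable without a payment provider or a real clock. A looping agent that retries the same request hits the idempotency check, and one that invents new requests hits the rate limit.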

Quick start

# Scan from your project root (default: current directory)
diplomat-agent-ts scan .

# Or a specific subdirectory
diplomat-agent-ts scan ./src
diplomat-agent-ts scan ./packages

# Generate the toolcalls.yaml SBOM (commit this)
diplomat-agent-ts scan . --output-registry toolcalls.yaml

# Fail CI when new unguarded tool calls appear
diplomat-agent-ts scan . --fail-on-unchecked

# JSON output for IDE agents, automation, custom dashboards
diplomat-agent-ts scan . --format json

The scanner emits:

  • A coloured ANSI report to stdout (default)
  • toolcalls.yaml — a diff-stable registry of every detected tool call (with --output-registry)
  • JSON — snake_case field names, interoperable with the Python scanner (--format json)

Integrate everywhere

CI — block unguarded PRs

# .github/workflows/diplomat.yml
- name: Diplomat governance scan
  run: npx -y @diplomat-ai/diplomat-agent-ts scan . --fail-on-unchecked

Exit code 1 if any tool call has no_checks status. Exit 0 otherwise — even if partial_checks exist (they're warnings, not blockers).
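The exit-code rule can be restated as a small function. This is a hypothetical sketch of the rule as described here, not the tool's source; `exitCodeForFailOnUnchecked` is an invented name, and the status strings are taken from the summary keys in toolcalls.yaml.

```typescript
// Hypothetical sketch of the --fail-on-unchecked exit-code rule.
type Status = "no_checks" | "partial_checks" | "confirmed";

function exitCodeForFailOnUnchecked(statuses: Status[]): number {
  // Only no_checks blocks the build; partial_checks is treated as a warning.
  return statuses.some((s) => s === "no_checks") ? 1 : 0;
}
```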

Pre-commit hook

# .pre-commit-config.yaml
repos:
  - repo: local
    hooks:
      - id: diplomat-agent-ts
        name: diplomat governance scan
        entry: npx -y @diplomat-ai/diplomat-agent-ts scan . --fail-on-unchecked
        language: system
        pass_filenames: false

IDE — review what the copilot wrote

The scanner runs locally in under 10 seconds on typical agent codebases. Ask Claude Code, Copilot, or Cursor to run it after generating tool-calling code:

"Run diplomat-agent-ts scan . and fix any unguarded tool calls."

Note: AI agents (Claude Code, Copilot, Cursor) may summarize scan output inaccurately when the result is long. Always read the raw stdout — or pipe to a file with --output report.txt and read that.

Acknowledge a tool call

When a function is intentionally unguarded or protected outside the static-analysis scope, mark it inline:

// checked:ok — protected by middleware/approval.ts
export async function chargeCustomer(amount: number, customerId: string) {
  return stripe.charges.create({ amount, currency: "usd", customer: customerId });
}

diplomat:ok and canary:ok are accepted as aliases. The next scan moves the call to confirmed status with an empty missing_hints list. The annotation appears in the YAML registry so reviewers can audit why something was confirmed.
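As an illustration of how such an annotation could be recognized, the sketch below parses a single comment line and captures the stated reason. `ACK_PATTERN` and `parseAck` are hypothetical. The real scanner works on the AST, and its exact matching rules (for example, whether a dash separator is required) are not documented here, so this sketch accepts the reason with or without one.

```typescript
// Hypothetical sketch: recognize an acknowledgement comment and extract its reason.
// Accepts the three markers named above; an optional dash may separate marker and reason.
const ACK_PATTERN = /^\s*\/\/\s*(checked|diplomat|canary):ok\b(?:\s*[\u2014\u2013-]*\s*(.*))?$/;

type Ack = { marker: string; reason: string | null };

function parseAck(commentLine: string): Ack | null {
  const m = ACK_PATTERN.exec(commentLine);
  if (!m) return null;
  const reason = m[2]?.trim();
  return { marker: `${m[1]}:ok`, reason: reason ? reason : null };
}
```

Keeping the reason next to the marker is what lets a reviewer audit why a call was confirmed, rather than only that it was.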

toolcalls.yaml — a behavioral SBOM

Like package-lock.json, but for what your agent can do, not what it depends on:

spec_version: "1.0"
language: typescript

summary:
  total: 12
  no_checks: 8
  partial_checks: 3
  confirmed: 1

tool_calls:
  - function: chargeCustomer
    file: src/payments.ts
    line: 42
    actions:
      - "return stripe.charges.create({ amount, currency, customer })"
    checks: []
    missing:
      - no bounds on amount
      - no rate limit
      - no idempotency key
    owasp: [ASI-01, ASI-02, ASI-03, ASI-06]

Commit it. Diff it in PRs. When your agent gains a new capability, the change shows up in review — before it ships.

Spec → docs/toolcalls-yaml-spec.md
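The PR-review workflow amounts to a set difference between two versions of the registry. A minimal sketch, assuming the YAML has already been parsed into objects carrying the `function`, `file`, and `checks` fields shown in the example; `newCapabilities` is a hypothetical helper, not part of the tool.

```typescript
// Hypothetical sketch: list tool calls that exist in the new registry but not in the old one.
// YAML parsing is left out; entries mirror fields from the toolcalls.yaml example.
type ToolCallEntry = { function: string; file: string; checks: string[] };

function newCapabilities(before: ToolCallEntry[], after: ToolCallEntry[]): ToolCallEntry[] {
  // Identify a tool call by file and function so line-number churn does not count as a change.
  const key = (e: ToolCallEntry) => `${e.file}#${e.function}`;
  const known = new Set(before.map(key));
  return after.filter((e) => !known.has(key(e)));
}
```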

OWASP Agentic Top 10 mapping

Each finding is tagged with one or more relevant codes from the OWASP Agentic Security Initiative Top 10. The v0.1.0 catalog covers the codes most directly tied to static side-effect detection (ASI-01, ASI-02, ASI-03, ASI-04, ASI-05, ASI-06, ASI-10). Codes that require runtime context (ASI-07 supply chain, ASI-08 misalignment, ASI-09 deception) are out of scope for static analysis — they are covered by diplomat-gate and diplomat.run at runtime.

| Code | Risk | When it fires |
|---|---|---|
| ASI-01 | Excessive Agency | Side effect with no auth check |
| ASI-02 | Tool Misuse | Any side effect (baseline tag) |
| ASI-03 | Privilege Compromise | Payments, deletes, destructive ops with no confirmation |
| ASI-04 | Resource Overload | Agent invocations without bounds |
| ASI-05 | Cascading Hallucination | LLM call chained to side effect |
| ASI-06 | Identity Spoofing | Missing rate limit or retry bound |
| ASI-10 | Overreliance | Nested agent invocations |

Full mapping in src/analyzer/owasp.ts.
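The "When it fires" column reads like a set of independent rules, which the sketch below encodes one line per code. This is one hypothetical reading, not the contents of src/analyzer/owasp.ts: the `Finding` shape, the guard names, and the choice to fire ASI-06 only when neither a rate limit nor a retry bound is present are all assumptions.

```typescript
// Hypothetical sketch of the mapping table; the real rules live in src/analyzer/owasp.ts.
type Guard = "auth" | "confirmation" | "rate_limit" | "retry_bound" | "bounds";

interface Finding {
  category: string;                // one of the 12 side-effect categories
  guards: Guard[];                 // guards detected around the call
  chainedFromLlmCall?: boolean;    // assumed flag: an LLM call feeds this side effect
  nestedAgentInvocation?: boolean; // assumed flag: an agent invoking another agent
}

const HIGH_IMPACT = ["payment", "database_delete", "file_delete", "destructive"];

function owaspCodes(f: Finding): string[] {
  const has = (g: Guard) => f.guards.some((x) => x === g);
  const codes = ["ASI-02"]; // baseline tag for any side effect
  if (!has("auth")) codes.push("ASI-01");
  if (HIGH_IMPACT.some((c) => c === f.category) && !has("confirmation")) codes.push("ASI-03");
  if (f.category === "agent_invocation" && !has("bounds")) codes.push("ASI-04");
  if (f.chainedFromLlmCall) codes.push("ASI-05");
  if (!has("rate_limit") && !has("retry_bound")) codes.push("ASI-06");
  if (f.nestedAgentInvocation) codes.push("ASI-10");
  return codes.sort(); // zero-padded codes sort correctly as strings
}
```

Under these assumptions, a payment call with no guards yields ASI-01, ASI-02, ASI-03, and ASI-06, which matches the chargeCustomer entry in the toolcalls.yaml example.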

Architecture

[Figure: internal architecture of diplomat-agent-ts in three lanes: SCANNER (patterns.ts, matcher.ts, ast-scanner.ts producing Tool[]), ANALYZER (checks.ts, owasp.ts producing ScanResult), REPORTER (registry.ts, json.ts, terminal.ts). Contribution hints at the bottom: add patterns in patterns.ts only, change matching logic in matcher.ts with tests, new output formats as new files in reporter/.]

Three pure stages, no shared state. See CONTRIBUTING.md for full guidelines.

Benchmarks

Real codebases, real numbers.

Methodology. File counts are the number of .ts / .tsx files actually scanned after the scanner's built-in exclusions (node_modules/, dist/, build/, *.test.ts, *.spec.ts, *.d.ts). They can differ from a raw git ls-files count by a few percent. Runs are pinned to the commits below. Findings counts (tool_calls, no_checks, partial) reproduce exactly at those commits; raw file totals on main will drift over time.
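The exclusion rules in the methodology note can be sketched as a path filter, which also shows why the scanned count differs from a raw file listing. `isScanned` and the two constant lists are hypothetical and cover only the exclusions named above; the scanner's real implementation may differ.

```typescript
// Hypothetical sketch of the built-in exclusions listed in the methodology note.
const EXCLUDED_DIRS = ["node_modules", "dist", "build"];
const EXCLUDED_SUFFIXES = [".test.ts", ".spec.ts", ".d.ts"];

function isScanned(path: string): boolean {
  const segments = path.split("/");
  const fileName = segments[segments.length - 1] ?? "";
  // Only .ts and .tsx files are candidates at all.
  if (!(fileName.endsWith(".ts") || fileName.endsWith(".tsx"))) return false;
  // Skip anything under an excluded directory.
  if (segments.slice(0, -1).some((dir) => EXCLUDED_DIRS.indexOf(dir) !== -1)) return false;
  // Skip tests and declaration files.
  return !EXCLUDED_SUFFIXES.some((suffix) => fileName.endsWith(suffix));
}
```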

| Codebase (scope) | Type | TS files scanned | Tool calls | no_checks | partial | Pinned commit |
|---|---|---|---|---|---|---|
| OpenClaw (src/) | Application | 7,874 | 419 | 332 (79%) | 87 | 49d9996d |
| Mastra (packages/) | Framework | 2,777 | 185 | 162 (88%) | 23 | 38b87964 |
| OpenAI Agents JS (packages/) | Framework | 426 | 33 | 31 (94%) | 2 | 629d35af |
| OpenAI Agents JS (examples/) | Examples | 302 | 32 | 28 (88%) | 4 | 629d35af |

Run the benchmarks yourself:

# Application — OpenClaw (pinned commit for reproducibility)
git clone https://github.com/openclaw/openclaw /tmp/openclaw
cd /tmp/openclaw && git checkout 49d9996d && cd -
npx diplomat-agent-ts scan /tmp/openclaw/src

# Framework: Mastra (full clone so the pinned commit from the table can be checked out)
git clone https://github.com/mastra-ai/mastra /tmp/mastra
cd /tmp/mastra && git checkout 38b87964 && cd -
npx diplomat-agent-ts scan /tmp/mastra/packages

# Framework + Examples: OpenAI Agents JS (pinned commit from the table)
git clone https://github.com/openai/openai-agents-js /tmp/openai-agents
cd /tmp/openai-agents && git checkout 629d35af && cd -
npx diplomat-agent-ts scan /tmp/openai-agents/packages
npx diplomat-agent-ts scan /tmp/openai-agents/examples

Output formats

| Format | Flag | Use case |
|---|---|---|
| Terminal | (default) | Human review |
| JSON | --format json | IDE agents, dashboards, automation |
| YAML registry | --format registry or --output-registry FILE | toolcalls.yaml SBOM, PR diffs |

Known limitations

  • Static analysis only — no runtime detection. If a guard is added by middleware or a gateway outside the file, annotate with // checked:ok — protected by [where].
  • Intra-procedural — guard detection looks at the same function or its immediate decorators. Cross-file guard chains require an annotation.
  • TypeScript files only: .ts and .tsx files are scanned. Plain .js files are skipped silently. For Python agents, use diplomat-agent. The scanner emits a warning if no .ts files are found in the target directory.
  • ORM patterns require the import — Mongoose, Sequelize, and TypeORM use generic method names (.save(), .create()), so the patterns are scoped to files that import the ORM. Re-exported models may be missed.
  • Abstraction layers — if a repo wraps its ORM or HTTP client behind a custom module (e.g. db.ts re-exporting Prisma without a direct import 'prisma'), call sites in consumers won't carry the importContains scope and may be missed. Use // checked:ok at the wrapper boundary.
  • Large repos without tsconfig — scanning 5,000+ files without a tsconfig.json can take 30–60 s on slower machines (9 s on M-series for 7,874 files). Point at a subdirectory (scan ./src) to reduce scope.

Full limitations and pattern refinement history → docs/limitations.md
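The import-scoping limitation is easier to see in a toy matcher. The sketch below treats patterns as plain data with an optional `importContains` scope (a name taken from the limitations list) and shows a wrapped ORM slipping through. All other names are hypothetical, and string matching stands in for the AST matching the real scanner performs.

```typescript
// Hypothetical sketch of import-scoped patterns; only `importContains` is named in the docs.
// String matching here is a stand-in for the real AST-based matcher.
interface Pattern {
  category: string;
  callText: string;        // text that identifies the call, e.g. ".save("
  importContains?: string; // when set, the file must import a module whose name contains this
}

const PATTERNS: Pattern[] = [
  { category: "database_write", callText: ".save(", importContains: "mongoose" },
  { category: "dynamic_code", callText: "eval(" },
];

function importSpecifiers(source: string): string[] {
  const specs: string[] = [];
  const re = /from\s+["']([^"']+)["']/g;
  let m: RegExpExecArray | null;
  while ((m = re.exec(source)) !== null) specs.push(m[1] ?? "");
  return specs;
}

function matchCategories(source: string): string[] {
  const imports = importSpecifiers(source);
  return PATTERNS.filter((p) => {
    const scope = p.importContains;
    const inScope = scope === undefined || imports.some((spec) => spec.indexOf(scope) !== -1);
    return inScope && source.indexOf(p.callText) !== -1;
  }).map((p) => p.category);
}
```

A file that imports mongoose directly and calls `.save()` is flagged, while a file that imports the model from a local `./db` wrapper is not. That second case is the one where an annotation at the wrapper boundary is needed.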

Roadmap

  • AST scanner with 40+ patterns across 12 categories
  • toolcalls.yaml behavioral SBOM with diff-stable output
  • OWASP Agentic Top 10 mapping
  • CI integration (--fail-on-unchecked)
  • // checked:ok annotations (with diplomat:ok / canary:ok aliases)
  • Validated against OpenClaw (7,874 files, ~9 s on M-series / ~30 s on x86, pinned commit 49d9996d)
  • Inter-procedural decorator resolution (v0.2)
  • SARIF 2.1.0 output (v0.2)
  • --diff-only for changed files (v0.2)
  • MCP server scanning (v0.3)
  • VS Code extension with inline diagnostics

Requirements

  • Node.js ≥ 20
  • 2 runtime dependencies: ts-morph (TypeScript compiler wrapper), yaml

Sibling projects

Community & support

Contributing

PRs welcome. The architecture above tells you exactly where things live. Read CONTRIBUTING.md first — it explains the "patterns are data, not logic" rule that keeps the matcher simple.

License

Apache 2.0 — Copyright 2026 Diplomat Services SAS. See LICENSE.