惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
博客园 - 聂微东
B
Blog RSS Feed
Apple Machine Learning Research
Apple Machine Learning Research
Hugging Face - Blog
Hugging Face - Blog
博客园 - 三生石上(FineUI控件)
博客园 - Franky
小众软件
小众软件
罗磊的独立博客
G
Google Developers Blog
美团技术团队
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
MongoDB | Blog
MongoDB | Blog
腾讯CDC
N
Netflix TechBlog - Medium
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
Security Latest
Security Latest
T
Threatpost
L
LINUX DO - 热门话题
P
Privacy & Cybersecurity Law Blog
J
Java Code Geeks
T
Threat Research - Cisco Blogs
V2EX - 技术
V2EX - 技术
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
NISL@THU
NISL@THU
M
MIT News - Artificial intelligence
Cisco Talos Blog
Cisco Talos Blog
Cyber Security Advisories - MS-ISAC
Cyber Security Advisories - MS-ISAC
H
Heimdal Security Blog
The Last Watchdog
The Last Watchdog
量子位
P
Palo Alto Networks Blog
W
WeLiveSecurity
H
Hacker News: Front Page
Hacker News - Newest:
Hacker News - Newest: "LLM"
博客园_首页
爱范儿
爱范儿
V
Vulnerabilities – Threatpost
Engineering at Meta
Engineering at Meta
Help Net Security
Help Net Security
D
Darknet – Hacking Tools, Hacker News & Cyber Security
S
Security Affairs
云风的 BLOG
云风的 BLOG
A
About on SuperTechFans
A
Arctic Wolf
大猫的无限游戏
大猫的无限游戏
T
The Exploit Database - CXSecurity.com
Hacker News: Ask HN
Hacker News: Ask HN
C
Cisco Blogs
Jina AI
Jina AI

Show HN

CSP Radar GitHub - awebai/aweb-team-coord-worktrees: An aweb team template for a minimum team with a permanent coordinator and worktrees with local developers. GitHub - fujibee/agmsg GitHub - lucastononro/notify: 100% local, free, offline attention skill for Claude Code: plays a sound and speaks a short status update when a long task finishes, blocks, or needs a decision. GitHub - sebastianwessel/skills: AI Skills tivatdoar / workout-to-work · GitLab GitHub - enumura1/py-sql-cleaner: Find, format, and safely extract embedded SQL from Python files. GitHub - intent-bench/intent-bench: Intent fulfillment benchmark for agentic AI engineering GitHub - steveking-gh/firmion: Firmion is DSL and engine for firmware image generation. GitHub - villagesql/villagesql-skills: Agent skills for VillageSQL - gemini-cli-extension; claude-code-plugin GitHub - 0gsd/enough: a personal language system for planning, writing, and translation. GitHub - Kaelio/ktx: ktx is an executable context layer for data and analytics agents 🐙 Allow Claude Code, Codex, and any AI agent to query data accurately through MCP with skills, memory and a semantic layer GitHub - ThatXliner/xtras: Xliner's Claude Code Skills GitHub - flightdeckhq/flightdeck: Observability and control plane for AI agents. GitHub - search-router/simple-search: Open-source reference app on top of the Search Router API: FastAPI + Jinja metasearch service with pluggable backends, deterministic mocks (no API key needed), RTL UI, Redis cache, and a demo ads cabinet. CSP Radar GitHub - Light-Heart-Labs/DreamServer: Turn your PC, Mac, or Linux box into an AI server. LLM inference, chat UI, voice, agents, workflows, RAG, and image generation. GitHub - Diplomat-ai/diplomat-agent-ts: What can your TypeScript AI agent do to the real world? Scan your code. See which tool calls have zero checks Code Block Selector - Visual Studio Marketplace Prometheus dependency graph — interactive showcase | Riftmap Show HN: I made a vi-like modal keyboard plugin for Figma GitHub - run-llama/liteparse: A fast, helpful, and open-source document parser GitHub - dalemyers/Roar: A macOS CLI tool for notifications GitHub - district-solutions/open-agent-tools-coder: Enables small-to-large self-hosted ai models to use local source code when running tool-calling agentic workloads. We actively data mine 20,900+ (2+ TB) popular github repos using large and small ai models to create reuseable: json, markdown and parquet files for local-first tool-calling models. GitHub - progapandist/stripeek: A local TUI proxy for real-time Stripe API debugging, built for navigating complex payloads fast. GitHub - sir1st/hermes-desktop: All-in-one cross-platform desktop app for Hermes Agent — bundles Python + hermes-agent + hermes-web-ui GitHub - astefanutti/shaderbang: Shebang for Shaders Show HN: Generate Claude Code Workflows using Spec Driven Development approach GitHub - nixys/nxs-universal-chart: The Helm chart you can use to install any of your applications into Kubernetes/OpenShift Show HN: AI agents for UK GDAD PCF roles and their skills The Two Pillars: Mixer Mode and Meta-Software in the Reorganization of Software Work After AI GitHub - JaiCode08/teleport-env What 1,000+ Harness Experiments Taught Me About Self-Improving Agents Show HN: Liiists, a Markdown-first, iOS and CLI list app SwiperTab – Get this Extension for 🦊 Firefox (en-US) GitHub - kouhxp/fftext: Summarize, explain, fact-check, or translate any text, URL, or file. No GPU. No cloud. One command GitHub - sweetpad-dev/sweetpad: Develop Swift/iOS projects using VSCode GitHub - dogmaticdev/IRON: IRON a.k.a. Intermediate Representation Object Notation is a Interpreter/Database that is used to create Programming Languages. GitHub - sjhalani7/vaen: Package your AI coding harness into a portable .agent file, and share it across repos, teams, & the community without ever having to copy-paste instructions, skills, MCP config, or secrets. Show HN: Gandalf the Grader Show HN: Citadeld – replay any CI failure locally from a single file GitHub - tdortman/cuSBF: High-Performance GPU Super Bloom Filter coral-ai/claude-code-token-xray at main · Coral-Bricks-AI/coral-ai GitHub - ulyssestenn/funes: Funes is a Git-based framework for LLM-managed knowledge work: an AI Librarian ingests raw sources, builds an interlinked Markdown knowledge base, and uses it to produce cited reports, analyses, and other outputs. GitHub - ThatXliner/gah: Git Add Hunk, built for agents to use GitHub - harmont-dev/harmont-cli: Command-line client for the Harmont CI platform GitHub - brooksmcmillin/mcp-authflow: OAuth 2.0 Authorization Server framework for MCP servers GitHub - javaid-codes/audit-supply-chain-agents GitHub - amorey/gochan: A small library of common channel architectures for Go, inspired by Rust GitHub - arifozgun/OpenGem: Free, Open-Source AI API Gateway with Gemini, OpenAI & Anthropic Compatibility in 1 file GitHub - Pranesh950/BioPetals: 🌸 Run BIOxAI models at home, BitTorrent-style. Fine-tuning and inference up to 10x faster than offloading GitHub - cnguyen14/bounty-doctor: Diagnose a GitHub bounty issue before you waste hours: detects honeypot scam repos, AI-bot attempt swarms, and stale contests. Show HN: CoreMCP – MCP Server for On-Prem DBs Show HN: KittyHTML – Render HTML/CSS as an inline image in your terminal GitHub - bingud/filemat: Web-based file manager Show HN: TruthLens – Free multi-signal deepfake image detector GitHub - apexlocal-jz/claude-usage-tray: Windows system-tray app showing your Claude Code rate-limit usage at a glance. Zero deps, ~300 lines of PowerShell. Cross-IDE (works regardless of VS Code, Cursor, plain terminal). Release v0.1.2.1 · kouhxp/yapsnap GitHub - noopolis/moltnet: Self-hostable chat network for AI agents. Pre-built bridges for Claude Code, Codex, and the Claws. Rooms, DMs, history. No Slack bots, no Matrix, no glue code. GitHub - tamerh/enju: Coordinating Humans, AI Agents, and Compute as Peers on a Shared Workflow Graph Show HN: Continuity-auth – Respect-weighted rate limits for the open web GitHub - luml-ai/luml: AI lifecycle platform where engineers and agents track experiments, train models, and ship to production. GitHub - mrdanielcasper/CoreTex: A UNIX-inspired, biomimetic, flat-file AI harness and knowledge engine. GitHub - clemg/pierre-github: Pierre's diffs.com and trees.software for Github GitHub - lyriks-io/unspaghettit: Behavior-driven AI development without prompt spaghetti. GitHub - sofumel/claude-handoff-revive: Resume Claude Code work after rate/usage/context limits without replaying the prior transcript. Auto-saves at 90%/95% usage. Plugin-installable, 10 languages. GitHub - dotexorg/saferpc: Typed, end-to-end encrypted RPC over any bidirectional channel. GitHub - BeeZeeAgent/beezee: Agent harness orchestration Legato Next.js Boilerplate for Internal Tools · CoreUI GitHub - clark-labs-inc/clark-hash: Clark Hash, 32x smaller searchable sketches for embeddings GitHub - ZeroPointRepo/youtube-mcp: The fastest YouTube transcript + YouTube search MCP for AI agents. Try for free. Typing Mastery — climb toward 100+ WPM, deliberately GitHub - Andebugulin/Awareen GitHub - fayzan123/claude-workflow-composer: Visual desktop app for composing multi-agent coding workflows. Drag agents, attach skills and MCPs, wire handoffs, export to .claude/ GitHub - StackOneHQ/stack-nudge We hardened an LLM agent. Each defense we added made it more exploitable. GitHub - alkait/WhatsKept: Agent-queryable WhatsApp history from an iOS backup — a single Go binary. GitHub - octelium/cordium: Open-source, general-purpose sandbox platform for devs and AI agents that provides identity-based secure access to infrastructure without credentials. GitHub - scosman/videowright: Build animated explainer videos with your coding agent GitHub - dipankar/dscode: The code editor you can take apart. GitHub - zoharbabin/web-researcher-mcp: MCP server (Go) for AI assistants: web search, content extraction, academic/patent/news research. Multi-provider routing, 4-tier scraping, search lenses. Works with Claude, Cursor, and any MCP client. GitHub - scanaislop/aislop: Catch the slop AI coding agents leave in your code: narrative comments, swallowed exceptions, as-any casts, dead code, oversized functions. 50+ rules across 7 languages (TypeScript, JavaScript, Python, Go, Rust, Ruby, PHP). Sub-second, deterministic, no LLM at runtime. MIT-licensed. GitHub - kouhxp/cheap-im: CPU-only voice agent approximating Thinking Machines' Interaction Models demo GitHub - unprovable/OrchidMantis: Orchid Mantis — standalone framework for Zero-Knowledge Proofs of eXploit (ZKPoX). GitHub - TangibleResearch/Halgorithem: A Algo designed to detect AI Hallucitions GitHub - CarpseDeam/Aura-IDE: An AI coding harness that shaped itself - Planner/Worker agents, repo awareness, surgical edits, validation, recovery, and safe diff approvals. GitHub - chojs23/concord: A feature-rich TUI client for Discord GitHub - aerf-spec/aerf: Agent Evidence Receipt Format (AERF) — an open specification for tamper-evident, independently verifiable records of AI agent actions. GitHub - Jwrede/tokentoll: Catch LLM cost changes in code review. Infracost for LLM spend. GitHub - samchon/ttsc: A `typescript-go` toolchain for compiler-powered plugins and type-safe execution + 500x faster lint integrated into compiler GitHub - Higangssh/homebutler: 🏠 Manage your homelab from chat. Single binary, zero dependencies. GitHub - olalie/tapmap: See where your computer connects and what stands out on a live world map. GitHub - Diplomat-ai/diplomat-agent: What can your AI agent do to the real world? Scan your code. See which tool calls have zero checks GitHub - Bajusz15/beacon: Open-source agent for secure remote access, monitoring, and deploys across home-lab and self-hosted machines like Raspberry Pi, N100, or any Linux server. Open web based TTY or tunnel Home Assistant and other local services securely without opening ports. BigTech AI News - Chrome 应用商店 GitHub - vinhnx/VTCode: VT Code is an open-source coding agent with LLM-native code understanding and robust shell safety. Supports multiple LLM providers with automatic failover and efficient context management. GitHub - Lumen-Labs/brainapi2: BrainAPI is a knowledge graph–powered AI memory layer that transforms unstructured data into structured knowledge, enabling intelligent search, recommendations, and contextual memory for AI agents and applications. GitHub - familiar-software/familiar: Let AI watch you work. Familiar lets your AI update its memory, skills, and knowledge by watching your screen. make sidebar/address bar rounded corner toggleable
claude-code-sre-handbook/watcher-semantic-example at main · har-ki/claude-code-sre-handbook
har-ki · 2026-06-16 · via Show HN

Semantic-memory variant of the SRE incident watcher. Adds similarity-based retrieval (via local embeddings) and validation discipline as toggleable layers on top of the file-based memory convention from watcher-memory-example/.

Self-contained. Does not import from or modify the other watcher examples.

Embedding model

  • Model: nomic-embed-text (274 MB, 768-dimensional)
  • Runtime: Ollama (local, air-gapped compatible)
  • Similarity scores are model-specific. Threshold values calibrated for this model are not transferable to other embedding models.

Three config knobs

All set in alert-watcher/config.yaml, no code edits needed:

# Retrieval mode: exact (SHA256-hash lookup) or similarity (cosine search)
retrieval_mode: similarity    # exact | similarity

# Minimum cosine similarity to surface a finding (similarity mode only)
similarity_threshold: 0.75

# When true, Phase 1 must classify the prior finding as
# validated / invalidated / inconclusive before using it
validation_discipline: true   # true | false

Architecture

watcher-semantic-example/
├── alert-watcher/
│   ├── main.py               # Orchestrator with configurable retrieval
│   ├── semantic_memory.py     # Embedding, indexing, retrieval layer
│   └── config.yaml            # Three toggles + standard config
├── claude-runner/
│   ├── invoke.sh              # Two separate claude -p phases
│   └── workspace-claude.md    # Model orientation (includes memory docs)
├── prompts/
│   ├── 01-investigate.md      # Phase 1: retrieval metadata + validation discipline
│   └── 02-propose-fix.md      # Phase 2: fix + required memory write
├── memory-store/
│   ├── incidents/             # Per-incident findings (<sha8>.md)
│   └── embeddings/            # Vector index (index.json)
├── test-incidents/            # Three test cases (see below)
├── seed-memory.sh             # Seeds memory + builds vector index
└── verify-retrieval.sh        # Verifies retrieval against all test cases

Locked invariants (same as watcher-memory-example)

  • Two separate claude -p phases (never collapsed)
  • --max-turns 40, --output-format stream-json --verbose
  • gh pr ready absent from all --allowedTools (human gate)
  • Memory paths validated via regex + realpath() containment

Retrieval modes

Exact (baseline)

Same as watcher-memory-example/. Looks up memory-store/incidents/<sha8>.md by the fingerprint hash. Hits only on identical service|exception_class.

Similarity

Embeds the incident description via nomic-embed-text, searches the vector index by cosine similarity. Returns the nearest finding if the score exceeds similarity_threshold; stays silent otherwise (safe-silence default).

Index format: Flat JSON file at memory-store/embeddings/index.json. Each entry stores the embedding, service, exception class, and fingerprint hash. Sufficient for hundreds of findings (not designed for millions).

Embedding strategy:

  • Stored findings use search_document: prefix with finding content (rich)
  • Queries use search_query: prefix with expanded exception context
  • CamelCase exception names are split for semantic signal (StockMismatchErrorstock mismatch error)

Validation discipline

When validation_discipline: true, Phase 1's prompt requires an explicit classification before using the prior finding:

> **Validation: VALIDATED** — prior finding matches because ...
> **Validation: INVALIDATED** — prior finding does NOT match because ...
> **Validation: INCONCLUSIVE** — cannot confirm, investigating further

If INVALIDATED, the agent must investigate from scratch. This is designed to catch the "recall-anchoring" failure mode where a partial match causes the agent to stop investigating too early.

Test incidents

Three test cases in test-incidents/:

Test Fingerprint Exact hash What it tests
Canonical ecommerce-api|StockMismatchError e2836e74 Baseline — should find and validate
Drift ecommerce-api|InventoryValidationError de61efe2 Renamed exception. Exact misses, similarity finds.
Near-miss ecommerce-api|StockMismatchError e2836e74 Same hash, different cause (cache bug). Both modes find it — tests whether validation discipline catches the mismatch.
No-precedent ecommerce-api|PaymentGatewayTimeout 2c8fc300 No relevant finding. Both modes should stay silent.

Verified retrieval scores (nomic-embed-text)

Test Exact Similarity score Above 0.75?
Canonical hit 0.80 yes
Drift miss 0.81 yes
Near-miss hit 0.80 yes
No-precedent miss 0.74 no (silent)

Quick start

# 1. Ensure Ollama is running with the embedding model
ollama pull nomic-embed-text

# 2. Seed memory with the canonical finding + build vector index
bash seed-memory.sh

# 3. Verify retrieval works for all test incidents
bash verify-retrieval.sh

# 4. Run the watcher (requires otel-demo cluster + Docker)
cp .env.example .env
# Fill in ANTHROPIC_API_KEY and GH_TOKEN
docker compose up -d --build

Switching modes between runs

Edit alert-watcher/config.yaml:

# Run 1: exact-hash baseline
retrieval_mode: exact
validation_discipline: false

# Run 2: similarity with discipline
retrieval_mode: similarity
validation_discipline: true

# Run 3: similarity without discipline (to see if discipline matters)
retrieval_mode: similarity
validation_discipline: false

No code changes, no rebuild. The config is read at watcher startup.