GitHub - bitomule/musts: The validation loop that stops AI coding agents from claiming work is done before it actually is.
bitomule · 2026-05-26 · via Hacker News - Newest: "AI"



AI agents are fast at editing code. They are less reliable at knowing when verification is actually finished.

musts gives your repository a local, enforceable definition of done:

The task is not done until musts validate is empty.

Figure: musts validate output showing two pending tasks with the exact cargo commands needed to resolve them.

Instead of hoping the agent remembers every build, test, UI check, and architecture rule, you declare those checks next to the code they protect. When files change, musts validate reports the exact validation tasks still pending. The agent runs them, records evidence, and repeats until the report is clean.

Get Started

1. Install the CLI

# Homebrew (macOS / Linux)
brew install bitomule/tap/musts

# Cargo (from crates.io)
cargo install musts --locked

# Precompiled binaries
cargo binstall musts        # or download directly from GitHub Releases

2. Create your first MUSTS.yml

Put a MUSTS.yml at the root of your repo:

checks:
  test:
    uses: cargo/test

Now ask what still needs to be validated:

musts validate

If code covered by that manifest has changed, musts returns a concrete task for the agent to complete. After running the requested command, the agent records evidence:

cargo test --workspace 2>&1 | tee /tmp/musts-cargo-test.log

musts evidence cargo-test-root \
  --text "cargo test --workspace passed" \
  --asset /tmp/musts-cargo-test.log

musts validate

When musts validate is empty, the repo has fresh evidence for the current workspace state.

3. Tell your agent to obey the loop

For Claude Code, install the plugin. It bundles the musts skill and a Stop hook that runs musts validate whenever Claude tries to finish a turn:

/plugin marketplace add bitomule/musts
/plugin install musts@musts

See docs/claude-code-plugin.md for install, update, uninstall, and private-fork details.

For other agents, add the rule to your AGENTS.md, CLAUDE.md, or equivalent repo instructions:

Before declaring a code change done, run `musts validate`.
Treat every reported task as required. Run the task, capture evidence outside
the workspace, submit it with `musts evidence`, and repeat until
`musts validate` is empty.

The CLI is agent-agnostic. Anything that can run shell commands can participate in the loop.
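For a custom agent harness, the rule above can be reduced to a small stop gate. The following is a minimal sketch, not part of musts itself: the names `may_finish_turn`, `run_command`, and `Runner` are hypothetical, and the only things taken from this README are the `musts validate` command and its documented exit code 0 for a clean report.

```python
import subprocess
from typing import Callable, Sequence

# Hypothetical runner type: takes an argv list and returns the exit code.
Runner = Callable[[Sequence[str]], int]


def run_command(argv: Sequence[str]) -> int:
    """Run a command and return its exit code (output goes to the terminal)."""
    return subprocess.run(list(argv)).returncode


def may_finish_turn(run: Runner = run_command) -> bool:
    """Return True only when `musts validate` reports a clean workspace.

    Per the exit codes documented in this README, 0 means clean. Any other
    code (pending tasks, configuration error, internal error) keeps the
    turn open.
    """
    return run(["musts", "validate"]) == 0
```

The runner is injectable so the gate can be exercised without the CLI installed; a real harness would call `may_finish_turn()` with the default runner before letting the agent report that it is done.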

What You Can Encode

musts is not limited to "run the test suite". A check can represent any validation rule your repo needs before an agent is allowed to stop.

Build and test checks

checks:
  fmt:
    uses: cargo/fmt
  clippy:
    uses: cargo/clippy
  test:
    uses: cargo/test

Targeted build checks

checks:
  app-build:
    uses: bazel/build
    with:
      target: //App:App

Product or architecture contracts

Use the built-in agent capability when the validation is a judgement call that needs a human-readable answer rather than a command exit code:

checks:
  usecase-shape:
    uses: agent
    paths:
      - "Sources/App/UseCases/**"
    with:
      facts:
        - "Every use case has exactly one public entry point."
        - "The entry point name describes the user action, not implementation detail."
        - "No use case reaches across module boundaries except through declared ports."

When a matching use case changes, musts validate asks the agent to verify those facts and submit a text explanation. That makes repo-specific rules visible, repeatable, and hard to forget.

UI and device checks

musts can also gate flows that need screenshots, videos, JSON reports, or other assets. Built-in and third-party capabilities decide what evidence they need; the agent should follow the evidence: and submit: lines in the musts validate report.

How The Loop Works

Figure: The musts loop. The agent edits code, runs musts validate, runs the tasks and captures evidence, submits the evidence, and repeats until the report is empty.

  1. You place MUSTS.yml files next to the code they protect.
  2. The agent edits code.
  3. musts validate fingerprints the relevant files and finds checks whose current scope is no longer covered by accepted evidence.
  4. Each capability turns dirty checks into concrete tasks.
  5. The agent runs those tasks and submits evidence with musts evidence.
  6. The loop repeats until musts validate is empty.

The ledger is content-based, not git-based. Comments, generated fixtures, architecture docs, and source files all count if they are inside a check's scope. That conservative model is intentional: musts does not guess whether a change was "semantic enough" to need validation.
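To illustrate what "content-based" means here, the sketch below hashes the bytes of every file inside a scope and compares the result with a previously accepted fingerprint. It is an assumption-laden illustration, not the actual musts algorithm: the function names, the use of SHA-256, and the `fnmatch`-style pattern matching are all hypothetical choices made for this example.

```python
import hashlib
from fnmatch import fnmatch
from pathlib import Path
from typing import Iterable, Optional


def scope_fingerprint(root: Path, patterns: Iterable[str]) -> str:
    """Hash the relative path and bytes of every file matching a pattern.

    Illustrative only: musts may use a different hash and glob semantics.
    """
    patterns = list(patterns)
    digest = hashlib.sha256()
    # Sort so the fingerprint does not depend on directory listing order.
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root).as_posix()
        if any(fnmatch(rel, pattern) for pattern in patterns):
            digest.update(rel.encode("utf-8"))
            digest.update(b"\0")
            digest.update(path.read_bytes())
            digest.update(b"\0")
    return digest.hexdigest()


def is_dirty(root: Path, patterns: Iterable[str], accepted: Optional[str]) -> bool:
    """A check is dirty when the accepted fingerprint is missing or outdated."""
    return accepted != scope_fingerprint(root, patterns)
```

Because the hash covers raw bytes, a comment-only edit inside the scope changes the fingerprint just as a logic change would, which matches the conservative model described above. Files outside the scope do not affect it.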

Why musts?

Why not just run cargo test or a Makefile?

Because the agent has to remember to do it. make all is a suggestion; musts validate is a contract that the turn cannot close without satisfying. The task list is generated from what actually changed, so the repo does not need one giant script for every possible validation path.

Why not a pre-commit hook or CI-only check?

Pre-commit hooks can be skipped. CI runs after the agent has already stopped and you have already moved on, by which point the context needed to fix the issue may be gone. musts runs in the gap between "the agent says done" and "you believe it".

Why not just trust the agent?

Agents are good at finishing turns. They are not always good at finishing work. musts makes a false "done" visible by moving verification into an external, repo-owned loop.

Figure: A comparison of two terminal sessions. On the left, an agent says 'done' without running any checks. On the right, the same agent runs musts validate, sees a cargo test task is still pending, runs it, and only then closes the turn.

Built-In Capabilities

The reference capabilities are built into the musts binary:

Capability     Use it for
agent          Text-backed contracts, architectural checks, manual reasoning tasks
cargo/fmt      cargo fmt --check
cargo/clippy   cargo clippy --workspace --all-targets -- -D warnings
cargo/test     cargo test --workspace
bazel/build    Bazel target builds
mav/expect     Mobile Agent Verifier flows and device evidence

Third-party extensions can add new capabilities in any language that speaks the JSON-over-stdio protocol. See docs/extensions.md and the worked example in docs/examples/eslint-check/.

Example: This Repo

musts validates itself on every PR. Its root manifest gates formatting, linting, and tests; the protocol crate also carries an agent contract for facts that should remain true across changes.

That dogfood loop is intentionally the same loop users run:

cargo build --release
./target/release/musts validate
# run the reported tasks
./target/release/musts evidence <task-id> --text "..." --asset /tmp/log
./target/release/musts validate

For contributor commands, release rules, and the required pre-PR validation sequence, see CONTRIBUTING.md.

Used At

musts runs in production on:

  • Undolly - finding duplicate photos
  • Boxy - organising physical items
  • HiddenFace - privacy-first face blur

Commands

musts validate                                 # report pending validation tasks
musts validate --json                          # machine-readable report
# record evidence for a task
musts evidence <task-id> --text "..." \
    --asset path/to/log --asset path/to/screen.png

Exit codes:

  • validate: 0 clean, 1 pending tasks, 2 configuration / stale / lock error, 70 internal error.
  • evidence: 0 accepted, 1 rejected by extension, 2 unknown task / stale snapshot / over-claim, 70 internal error.
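A wrapper script may want to turn these codes into readable messages and decide whether the agent should keep looping. The sketch below copies the meanings from the list above; the names `EXIT_MEANINGS`, `describe_exit`, and `agent_should_continue` are hypothetical, and the policy of continuing only on code 1 is an assumption about how a harness might react, not documented musts behaviour.

```python
# Exit code meanings as listed in this README.
EXIT_MEANINGS = {
    "validate": {
        0: "clean",
        1: "pending tasks",
        2: "configuration / stale / lock error",
        70: "internal error",
    },
    "evidence": {
        0: "accepted",
        1: "rejected by extension",
        2: "unknown task / stale snapshot / over-claim",
        70: "internal error",
    },
}


def describe_exit(command: str, code: int) -> str:
    """Return the documented meaning of an exit code for a musts subcommand."""
    meanings = EXIT_MEANINGS[command]
    return meanings.get(code, f"undocumented exit code {code}")


def agent_should_continue(validate_code: int) -> bool:
    """Assumed policy: only 'pending tasks' (1) means run tasks and retry.

    Code 0 means the loop is finished; 2 and 70 likely need attention that
    rerunning validation tasks will not provide.
    """
    return validate_code == 1
```

For example, `describe_exit("evidence", 2)` returns the "unknown task / stale snapshot / over-claim" label, which a harness could surface to the agent instead of a bare number.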

Stability

musts is pre-1.0. The CLI surface, extension protocol, and MUSTS.yml schema may change between minor versions until 1.0. The validation loop is already used by this repository and by production apps, but you should expect some API movement while the format settles.

Docs

Start at docs/README.md for the documentation index.

Advanced topics:

  • .mustsignore - exclude committed generated files or canonical fixtures from scope hashes.
  • CONTRIBUTING.md - build, test, release, and PR-title rules.

License

Licensed under either of Apache License, Version 2.0 or MIT license at your option.

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in this project by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.