惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

酷 壳 – CoolShell
酷 壳 – CoolShell
H
Hacker News: Front Page
P
Palo Alto Networks Blog
T
ThreatConnect
Apple Machine Learning Research
Apple Machine Learning Research
博客园_首页
T
True Tiger Recordings
P
Privacy & Cybersecurity Law Blog
B
Blog
IT之家
IT之家
Last Week in AI
Last Week in AI
F
Full Disclosure
Hacker News: Ask HN
Hacker News: Ask HN
C
Comments on: Blog
Microsoft Azure Blog
Microsoft Azure Blog
C
Cybersecurity and Infrastructure Security Agency CISA
Microsoft Security Blog
Microsoft Security Blog
博客园 - 【当耐特】
N
News and Events Feed by Topic
NISL@THU
NISL@THU
腾讯CDC
雷峰网
雷峰网
Security Latest
Security Latest
李成银的技术随笔
M
Microsoft Research Blog - Microsoft Research
L
LangChain Blog
L
Lohrmann on Cybersecurity
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
C
Check Point Blog
Y
Y Combinator Blog
Recent Announcements
Recent Announcements
博客园 - Franky
N
News | PayPal Newsroom
V
V2EX
A
About on SuperTechFans
The Register - Security
The Register - Security
月光博客
月光博客
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
Google Online Security Blog
Google Online Security Blog
MyScale Blog
MyScale Blog
Cisco Talos Blog
Cisco Talos Blog
Vercel News
Vercel News
WordPress大学
WordPress大学
C
Cyber Attacks, Cyber Crime and Cyber Security
The Hacker News
The Hacker News
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
爱范儿
爱范儿
A
Arctic Wolf
L
LINUX DO - 最新话题
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More

Hacker News - Newest: "LLM"

PopuLoRA: Co-Evolving LLM Populations for Reasoning Self-⁠Play If an LLM is too expensive it won't be next year "This paper is LLM reviewed" > "this paper is peer-reviewed" StepStone: LLM-Based GPU Kernel Driver Fuzzing via User-Space Libraries [pdf] GitHub - AssimilatedHuman/LLM-Inquisitor: Evaluating AI behaviour under real‑world work conditions to surface issues before they become problems. LLM INQUISITOR identifies failures (drift, instability etc) by observing AI during normal tasks — a tool the industry desperately needs to stem the 85% failure rate. Includes Quick Start, Practitioner’s Guide and Methodology. Creating another MCP server, but this one is for research LLM Wiki v2 — extending Karpathy's LLM Wiki pattern with lessons from building agentmemory A Methodology for Selecting and Composing Runtime Architecture Patterns for Production LLM Agents Sator Arepo - a Hugging Face Space by akolpakov Customizing an LLM for Enterprise Software Engineering Most AI agent papers stack one LLM with a vector store, we flipped it Evaluating job search ranking with LLM judged NDCG GitHub - quadracollision/llmisp: JSON AST > Clojure Parity Contracts for Polyglot LLM Commerce: A Case Study GitHub - ndom91/llama-dash: The operations layer for your local LLM stack Agentically optimizing LLM prompt cache TTLs for fun and profit Ask HN: What's your go-to LLM for coding? How do you reduce LLM spam in PR reviews? Ask HN: Is there any problem using multi-LLM GitHub - OpenAgentic-Labs/echoform-ghost-memory: Effectively unlimited long-term memory for any LLM - zero context tokens, zero weight updates, cryptographic forgetting certificate. PSA — Posture Sequence Analysis Why More Context Can Make an LLM Worse GitHub - robertoranon/tokoro: A toolbox for building event publish & discovery web sites, apps, feeds, and more GitHub - sermakarevich/chunker: Agentic approach to chunking a document A new EDIT tool for LLM agents LLMCap — Hard Dollar Caps on LLM API Calls MLSys @ WukLab - Nitsum: Serving Tiered LLM Requests with Adaptive Tensor Parallelism SuperInfer: SLO-Aware Rotary Scheduling and Memory Management for LLM Inference on Superchips What political censorship looks like inside an LLM's weights — a mechanistic-interpretability study of Qwen 3.5 Managing metadata is essential in LLM world Fixing LLM Writing with Distribution Fine Tuning twitter.com Show HN: An LLM that's better at writing The local shape of LLM stable regions GitHub - msunda17/impactarbiter-cli The Infrastructure Behind Making Local LLM Agents Useful PostgreSQL ext makes LLM available as an index for similarity searches,inference GitHub - Tetrahedroned/Agent-Braille: Deterministic 8-bit machine-to-machine protocol for AI agent state. ~92% fewer state-tracking tokens on real Claude Code sessions, a proven single-bit-error-safe command code, fully reproducible. Tell HN: Writing an LLM critique/takedown? – Do not use an LLM to write it 🌱 an LLM models our worst behavior Prompt eval cues predicted refusal shifts across 32k LLM rollouts Ask HN: Is Java the ideal language for LLM-assisted coding? AI Foundry – Flat-Fee Unlimited LLM Inference on Blackwell GPUs in NZ LLM tracing with MLflow AI Gateway LLM Performance by Programming Language The LLM Looked Smart. The Metrics Disagreed – tiago.rio.br The Four Horsemen of the LLM Apocalypse GitHub - piqoni/piqo-extension: A good interface is invisible Intro to TLA+ for the LLM Era: Prompt Your Way to Victory Give every tool LLM wiki and bypass Claude Code SSH Throttle The Ultimate LLM Fine-Tuning Guide Ask HN: What LLM models are you using and why? Five Agents, One Browser: Werewolf on Quack + DuckDB LLM models are not ready for orchestrating many agents ClickBook — Offline AI eReader - Apps on Google Play DeepSeek-V4-Flash means LLM steering is interesting again Recent Developments in LLM Architectures: KV Sharing, mHC, and Compressed Attention Recent Developments in LLM Architectures: KV Sharing, MHC, Compressed Attention We Built SynapseKit: The Truth About Production LLM Frameworks GitHub - albedan/ai-ml-gpu-bench: A suite to benchmark CPU/GPU Python performance in training ML models and running local LLMs GitHub - chopratejas/headroom: Compress tool outputs, logs, files, and RAG chunks before they reach the LLM. 60-95% fewer tokens, same answers. Library, proxy, MCP server. if you are redlining the LLM, you aren't headlining Most Meaningful Dates on the Web and for an LLM I tested 8 LLM models on Linux without using the GPU RelaxAI – UK sovereign LLM inference at 80% cheaper than OpenAI/Claude GitHub - Andyyyy64/whichllm: Find the local LLM that actually runs — and performs best — on your hardware. Ranked by real, recency-aware benchmarks, not parameter count. One command, run it instantly. GitHub - krellixlabs/llm-reasoning-research: Curated, annotated research on reasoning gaps in large language models — temporal reasoning, causal reasoning, and beyond. Agentic evals or LLM as a judge? considering cost, time and quality Known By Their Actions: Fingerprinting LLM Browser Agents via UI Traces Add an LLM policy for `rust-lang/rust` by jyn514 · Pull Request #1040 · rust-lang/rust-forge GitHub - nimeshnayaju/markdown-parser: A streaming-capable markdown parser, written in TypeScript Dragos Documents First LLM-Assisted Strike on Water Infrastructure in Mexico Alchemize: PyMC's model to replace Stan/PyMC, etc. with an LLM BlitzGraph - The AI-native backend. Pokémon SVG Bench LLM Witch Hunts are getting F'in Irritating bliki: Interrogatory LLM Ctx-opt: TypeScript middleware to trim LLM chats to a token budget Show HN: Local-first Kubernetes YAML visualizer (no server, no LLM) Why Ruby Is the Better Language for LLM-Powered Development Paper page - Geometry Conflict: Explaining and Controlling Forgetting in LLM Continual Post-Training Show HN: Asciidia – LLM-Powered Game State media control shapes LLM behaviour by influencing training data Small Model Forensics How LLM Inference Works Multi-LLM AI trading agent harness GitHub - crawshaw/yeah: yeah: LLM-powered yes/no CLI tool Predicting Rare LLM Failures with 30× Fewer Rollouts — LessWrong Mechanism Design for Quality-Preserving LLM Advertising I tried to put an on-device LLM in an iOS Share Extension. It didn't fit Show HN: Gox – Strict static analyzer for Go designed for LLM-written code GitHub - torrix-ai/install Show HN: MCPSafe – Free security scanner for MCP servers using 5-LLM consensus Ada-MK: Adaptive MegaKernel Optimization via Automated DAG-based Search for LLM Inference Atlas Inference Engine Hi-Vis: one-shot jailbreak disguised as LLM "software patch" reaching 100% ASR Loading/running every LLM with 4M ctx in 3 clicks Free AI Leak Checker — Is Your Prompt Leaking Data? GLiGuard: 16x Faster Safety Moderation with a Small Language Model - Pioneer AI by Fastino Labs Are LLM Useful for Solo Founders
GitHub - JordanCT/VigIA-Orchestrator
2026-04-10 · via Hacker News - Newest: "LLM"

VigIA: Deterministic State Machine for LLM Orchestration

Runtime: .NET 10 Native AOT Paradigm: Deterministic FSM License: Apache 2.0

LLMs are probabilistic engines. Forcing them to manage application state or business logic leads to hallucination spirals. VigIA takes state management away from the LLM, demoting it to a purely functional heuristic parser bounded by a strict, deterministic Finite State Machine (FSM) built in .NET 10 (Native AOT).

If the LLM hallucinates a requirement, violates a Domain-Driven Design (DDD) boundary, or if the user evades providing business rules, the .NET Orchestrator intercepts the payload, discards the mutation, rolls back the chat history, and issues a [NACK] (Negative Acknowledgment).

VigIA Execution Demo (Demo: The Judge agent intercepting user evasion, the FSM issuing a Cognitive Strike, and forcing the user to provide strict domain rules).

1. The Homelab Context

This architecture was developed and stress-tested on Ubuntu 24.04 LTS with a local RTX 5090 running Gemma 4. It is designed for zero-latency local inference, proving that enterprise-grade deterministic orchestration can run entirely on edge hardware without relying on external, rate-limited APIs.

2. The Technical Problem: State Corruption

Standard LLM agents (like LangChain wrappers) suffer from State Corruption. When an LLM fails to generate a valid JSON or hallucinates a business rule, the failure is appended to the chat history. In the next turn, the LLM reads its own garbage output, assumes it as ground truth, and spirals into an unrecoverable state.

Furthermore, LLMs suffer from Sycophancy. If a business user asks for a "Stock Reservation" system but provides no specific rules, the LLM will hallucinate a textbook CRUD response just to appease the user, silently bypassing critical domain invariants like Reservation TTL (Time-To-Live) and cancellation release triggers.

3. The Solution: FSM Interceptor & Snapshot Rollbacks

VigIA solves this through a strict architectural pipeline (VigiaAgentOrchestrator):

  1. Structured State Vector (SSV): Chat history is purged of state. Global context is maintained via an immutable StructuredStateVector. Only successfully resolved Aggregate Roots are injected into the prompt, preventing context window poisoning.
  2. Deterministic Cross-Validation (InvariantEnforcer): The LLM's JSON output is never trusted. It is routed through a strict validation layer. It doesn't just check syntax; it enforces DDD invariants. Does the payload have an Actor and an Action? Does it contain exhaustive Gherkin criteria (at least one Happy Path and one Edge Case)? If not, the transition is aborted.
  3. Snapshot Rollbacks (ChatHistoryManager): When the InvariantEnforcer returns a failure (via a strict Result<T, E> monad), the FSM performs a snapshot rollback. The LLM is punished with a "Cognitive Strike" prompt detailing its exact structural error, but its previous hallucination is erased from the chat history.
  4. The 3-Strike[NACK] Rule: If the user evades questions (detected by the Judge agent) or the Synthesizer LLM fails to pass the InvariantEnforcer 3 consecutive times, the FSM triggers a Hard Block and halts the system.

4. Why .NET 10 Native AOT? (The Engineering)

VigIA is not a Python script relying on mutable dictionaries. It is a compiled, memory-safe binary designed for production orchestration.

  • Zero Reflection: JSON serialization is handled entirely by C# Source Generators (OrchestrationJsonContext).
  • Memory Safety: The SSV relies on readonly record structs and ImmutableDictionary. State transitions generate mathematically distinct vectors, preventing memory leaks during long elicitation sessions.
  • Monadic Error Handling: Exceptions are not used for control flow in the hot path. The Result<T, E> monad guarantees that the Orchestrator handles every possible LLM failure deterministically at compile-time.

5. Architecture Blueprint

Vigia.RequirementsMaster.slnx/
│
├── src/Vigia.Core.Domain/
│   # 🛡️ LAYER 1: The Deterministic Core. Zero dependencies.
│   ├── Aggregates/           # ArtifactRequirement, SystemBlueprint.
│   ├── Exceptions/           # DomainAmbiguityException, RejectionReason.
│   ├── Monads/               # Result<T, E> forcing compile-time NACK handling.
│   ├── StateVectors/         # StructuredStateVector (SSV) immutable snapshots.
│   └── ValueObjects/         # SystemId, ArtifactId, VectorVersion.
│
├── src/Vigia.Agent.Orchestration/
│   # 🧠 LAYER 2 & 3: Cognitive auditing and FSM state-shift management.
│   ├── Commands/             # CommandInterceptor (e.g., /approve handling).
│   ├── CrossValidation/      # InvariantEnforcer and strict Payload schemas.
│   ├── FSM/                  # Pure functional mapping: f(SSV, Payload) -> State | NACK.
│   ├── History/              # ChatHistoryManager (Snapshot & Rollback mechanics).
│   ├── Ports/                # ILlmInferenceEngine interface.
│   ├── Prompts/              # CognitivePrompts, VigiaBasePrompt, VigiaJudgePrompt.
│   └── Serialization/        # OrchestrationJsonContext (SourceGen for AOT).
│
├── src/Vigia.Infrastructure.LLM/
│   # 🔌 LAYER 4: Infrastructure and Inference.
│   ├── Configuration/        # LlmSettings.
│   └── Engines/              # LocalInferenceClient (vLLM/OpenAI compatible) + Strict Schemas.
│
├── src/Vigia.LiveTest/
│   # 🖥️ PRESENTATION: The interactive FSM REPL console.
│   ├── Presentation/         # ReplOrchestrator, VigiaConsoleTerminal.
│   └── appsettings.json      # SystemBlueprint injection.
│
└── tests/Vigia.Tests.Unit/
    # 🧪 VALIDATION SUITE: Mathematical proof of state immutability.

6. Execution Flow: The Inventory System Test Case

VigIA is driven by a SystemBlueprint. In this repository, we provide a real-world test case: an Inventory Management System. The FSM forces the LLM and the User to sequentially resolve 5 artifacts: CategoryManagement, ProductManagement, InboundStock, ManualStockAdjustment, and finally, the ultimate Sycophancy Trap: StockReservation.

  1. The Judge (Sensing): User input is first evaluated by a Judge LLM (Temperature 0.0). It checks for IsDomainDrift, IsEvasion, or IsMalice. If true -> NACK.
  2. The Synthesizer (Elicitation): If approved, the Synthesizer LLM evaluates if the data is complete. If missing -> Emits Type: Question.
  3. The FSM Interceptor: Once the LLM emits a Deliverable, the .NET FSM validates the Gherkin scenarios, Actor, Action, and Domain Rules.
  4. Sign-Off: The FSM locks the state and requires the user to type /approve.

7. Running VigIA Locally

Prerequisites

  • .NET 10 SDK (Preview)
  • A local LLM inference server (vLLM, LM Studio, Ollama) exposing an OpenAI-compatible API.

Setup

  1. Clone the repository.
  2. Edit src/Vigia.LiveTest/appsettings.json to point to your local LLM endpoint:
"LlmSettings": {
  "Endpoint": "http://localhost:8000/v1",
  "ModelId": "gemma-4-local",
  "ApiKey": "empty"
}
  1. Run the REPL Terminal:
cd src/Vigia.LiveTest
dotnet run -c Release

8. Background & Research

This architecture evolved from our initial academic research: "VIGÍA 4+1: Mitigating LLM Sycophancy and Normative Bias in Requirements Engineering via Deterministic Agentic Orchestration" (March 2026).

While the original paper utilized Command-R as the baseline model, this repository represents the production-grade evolution of the framework. It has been migrated to .NET 10 Native AOT, refactored with Monadic state handling, and optimized to orchestrate superior reasoning models like Gemma 4.


Built with rigorous engineering. No hype. Just state machines.