LLM’s code is just untrusted text. Until you validate it. – H[ack]-∞S
justorius · 2026-05-23 · via Hacker News - Newest: "LLM"

People often ask me what the best programming language is to use with LLMs. One of the strongest options, in my opinion, is Rust. However, before choosing it, you first need to ask yourself whether it’s worth picking a language with manual memory management instead of one with a garbage collector.

And regardless of that decision, the first thing is to understand that LLMs are not deterministic systems.

They should be used for what they actually are: smart suggesters.

You just shouldn’t trust suggesters by default.
They read input text.

They respond with output text.

Disclaimer: it’s not code. Not yet.

Even if it looks like code and compiles as if it were actual code, trust me.

It’s not.

The rule is:

code is in an untrusted state until you validate it.

untrusted state

No matter the language.
Rust is no different.
Its compiler’s strictness doesn’t exempt you from the rule.

But let’s take a step back, in order to understand why we have to evaluate other languages, not only Rust.

Manual memory management forces you to obsess over ownership & lifetimes at every step.

The bugs:

  • use-after-free
  • double-free
  • leaks
  • dangling pointers
  • overflows

… are brutal to debug, especially in large or concurrent codebases. Code tends to be complex, and sometimes application logic depends as much on how memory is allocated as on how data is processed. That means there are tons of elements to consider in the I/O design, and the difficulty grows quickly in complex infrastructures, where you can have async processes, threads, locks, and concurrent accesses.

In this scenario, people often choose GC languages, because they remove an entire class of problems. You allocate and move on. Cleaner code, simpler APIs, safer concurrency.

But GC has costs. Latency and memory.

For systems, games, kernels, databases, manual control still wins.

On the other hand, modern GCs are so highly optimized that, for most software, you can choose them unless your main target is extreme performance over productivity and simplicity.

And Rust is a modern language that tries to thread the needle with the borrow checker. It gives you extreme performance at the cost of massive complexity.

Rust teams fight the compiler all day long. Consider, for instance, a typical Rust+Tokio project: Rust’s strictness plus Tokio’s zero-cost async creates a combination that is extremely safe and fast when done right, but mentally expensive, even in apparently small details like lifetime management.

This is exactly why some people prefer simpler GC languages.

But there are a lot of people who choose Rust because of its pattern-heavy nature and, as a consequence, its reputation for being LLM-friendly.

What they often miss, however, is that this same strictness brings its own inner complexities: the ownership model, borrow checker, and lifetime management can add significant cognitive overhead and development friction that goes beyond what most teams anticipate. If you only consider how easy it is to generate Rust code, but you don’t properly evaluate how much time you have to spend reviewing and validating it, you’re introducing a new generation of technical debt.

This is one of the most underestimated issues: LLMs are excellent at generating code, but they don’t truly understand simplicity or the deeper abstractions in software. They’re just very capable text generators, helping you to get sophisticated scaffolding without any inherent logic, intuition, or understanding of consequences. 

It’s entirely up to you to steer the process: to keep things simple, eliminate unnecessary code, and avoid hidden side effects. And in Rust, believe me, it takes time and effort.

What worries me is how many developers underestimate this aspect, treating this generated “text” as if it were the reliable output of a deterministic system that needs no double-checking.

They speak of LLMs as just another clean layer in the stack, comfortably sitting between the human and the machine.

It’s not.

It’s probabilistic text, the result of statistical patterns learned from a vast, uncontrolled corpus of human writing.
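To make "probabilistic text" concrete, here is a toy sketch of next-token sampling. The token table and its probabilities are invented for illustration and do not come from any real model; the only point is that the same prompt can yield different text on different runs, while always picking the top token (greedy decoding) does not.

```python
import random

# Hypothetical next-token probabilities for one fixed prompt (made-up numbers).
NEXT_TOKEN_PROBS = {"==": 0.6, "is": 0.3, "=": 0.1}

def sample_token(rng: random.Random) -> str:
    """Draw one token according to its probability, as a sampling decoder would."""
    tokens = list(NEXT_TOKEN_PROBS)
    weights = list(NEXT_TOKEN_PROBS.values())
    return rng.choices(tokens, weights=weights, k=1)[0]

def greedy_token() -> str:
    """Always pick the most likely token: repeatable, but still only a guess."""
    return max(NEXT_TOKEN_PROBS, key=NEXT_TOKEN_PROBS.get)

rng = random.Random()  # unseeded on purpose: results differ between runs
samples = [sample_token(rng) for _ in range(1000)]
distinct = set(samples)  # in practice more than one token shows up
```

Note that `=` and `==` mean very different things in most languages, yet both are plausible continuations for the sampler.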

This shift has quietly made humans the bottleneck. We no longer need to type code to create software. Or at least, typing is not the only way.
The act of writing it used to force us to think, design, and truly understand what we were building, though. And this step cannot be skipped or delegated to text-generating machines.

They just don’t do it.

Now we write a vague prompt with some requirements and expect the machine to figure out what we actually need. This works surprisingly well for simple, common tasks, which is why it’s fair to call LLMs a powerful autocomplete.

"Given the id parameter, write the SQL statement that updates the user's email"

SQL prompt
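For reference, a plausible answer to that prompt could look like the sketch below. The `users` table, its columns, and the function name `update_user_email` are assumptions made for this illustration, since the prompt does not specify a schema; the in-memory SQLite database is only there to make the example runnable.

```python
import sqlite3

def update_user_email(conn: sqlite3.Connection, user_id: int, new_email: str) -> int:
    """Update one user's email and return the number of rows changed."""
    # Placeholders keep the values out of the SQL text (no string formatting).
    cur = conn.execute(
        "UPDATE users SET email = ? WHERE id = ?",
        (new_email, user_id),
    )
    conn.commit()
    return cur.rowcount

# Hypothetical schema, only for this illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.execute("INSERT INTO users (id, email) VALUES (1, 'old@example.com')")

changed = update_user_email(conn, 1, "new@example.com")
```

Even here the reviewer still has things to check that the prompt never mentioned: parameter binding, what happens when the id does not exist, and who commits the transaction.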

Easy. Isn’t it?

Real-world development is rarely that linear, though. It often involves complex context that’s extremely hard to describe in natural language, often split across the various physical and logical spaces where engineering happens.

In those cases, many experienced engineers still find it faster and more precise to write the code directly rather than struggling to find the perfect words to describe what they want.

That’s exactly why formalism exists.

A pure scientist would probably prefer to write:

Y=λf. (λx. f (x x)) (λx. f (x x))

Fixed-point combinator

… instead of producing dozens of pages describing fixed-point combinators that turn any function into a recursive version of itself, without needing named functions or loops.
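The same idea can be run as code. Python evaluates arguments eagerly, so the Y combinator exactly as written above would recurse forever; the sketch below uses the closely related Z combinator, which wraps the self-application in an extra lambda. The names `Z` and `fact_step` are chosen for this example only.

```python
# Z combinator: the eager-evaluation variant of Y.
# The inner `lambda v: x(x)(v)` delays the self-application until it is called.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# A factorial "step" that receives its own recursion as `rec`
# instead of referring to itself by name.
fact_step = lambda rec: lambda n: 1 if n == 0 else n * rec(n - 1)

factorial = Z(fact_step)
print(factorial(5))  # 120
```

One line of formalism, one line of usage, and no named recursion or loops anywhere.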

So when they say that:

"In the AI age the problem is no more the code"

AI claim

…they’re omitting some critical details.

The problem is no longer the code itself, as a sequence of letters, words, and lines.

“Coding”, in the AI age, is a marketing simplification.

"Coding" !== "Writing software" ∈ "Software engineering"

Writing code, as the act of typing commands, is related to thousands of software engineering activities. Written code can have ~0 external context, or it can depend on and/or affect billions of elements outside of the namespace where it lives.

This is the difference.

And it changes everything.

If you blindly rely on generated text without considering its implications, or while underestimating them, you’re probably using AI the wrong way.
And no, test coverage is no longer enough. Now more than ever.
Test coverage assumptions are (were) a mutual contract between the team and the system. Starting from a bunch of functional requirements, engineers designed the system and iteratively wrote tests to prove the absence of a predefined set of possible bugs.

Assume they create a system that draws a square, then assert that it has four sides of equal length meeting at 90° angles.

– What if the system creates a rectangle?

– What if it’s a cube instead?

– What if there is an extra line, in some cases?

The tests you defined in the first iterations, based on the first LLM output, cannot tell if or where the AI later added some extra logic that causes unpredictable cases like these.

So every time AI generates some new text, you should re-evaluate everything and update the test suite accordingly.
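The square example can be sketched in a few lines. Everything here is hypothetical: `draw_square` stands for the first accepted version, `draw_square_regenerated` for a later output that quietly adds a diagonal, and the two check functions stand for the test suite before and after review. The first check encodes "four equal sides meeting at 90°" but only looks at the first four segments, so it keeps passing.

```python
import math

def draw_square(x, y, side):
    """First accepted version: four sides, nothing else."""
    a, b, c, d = (x, y), (x + side, y), (x + side, y + side), (x, y + side)
    return [(a, b), (b, c), (c, d), (d, a)]

def draw_square_regenerated(x, y, side):
    """Hypothetical later output: the same four sides plus an extra diagonal."""
    sides = draw_square(x, y, side)
    return sides + [(sides[0][0], sides[1][1])]

def _vector(segment):
    (x1, y1), (x2, y2) = segment
    return (x2 - x1, y2 - y1)

def first_iteration_check(segments):
    """Four equal sides meeting at 90 degrees, looking only at the first four segments."""
    if len(segments) < 4:
        return False
    vectors = [_vector(s) for s in segments[:4]]
    lengths = [math.hypot(vx, vy) for vx, vy in vectors]
    equal_sides = all(math.isclose(l, lengths[0]) for l in lengths)
    right_angles = all(
        math.isclose(v[0] * w[0] + v[1] * w[1], 0.0, abs_tol=1e-9)
        for v, w in zip(vectors, vectors[1:] + vectors[:1])
    )
    return equal_sides and right_angles

def updated_check(segments):
    """Same property, plus an exact segment count so extra lines are rejected."""
    return len(segments) == 4 and first_iteration_check(segments)

print(first_iteration_check(draw_square_regenerated(0, 0, 2)))  # True
print(updated_check(draw_square_regenerated(0, 0, 2)))          # False
```

The old test was never wrong about squares; it simply never anticipated output that nobody asked for.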

This can be convenient in some cases, inefficient in others, risky in most, and time-consuming always.

And you shouldn’t skip this step.

The rule is to:

never ever consider an LLM's output as if it were a compiler artifact

text generator

It’s just text that eventually compiles.

It becomes code only after you validate it.

So, to recap, it’s a trust boundary model for LLM-assisted development, where the states are:

…and the flow is:

Raw output from an LLM is always treated as untrusted text (MU-TXT). Only after explicit human review, validation, and acceptance is it promoted to trusted code (MT-TXT → HVC) and allowed to enter the codebase via commit and branch merge.

The loop emphasizes that code generation is an iterative, human-driven process, not an LLM-, language-, or AI-based automation.
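A minimal sketch of such a trust boundary follows. It is a simplification under stated assumptions: it uses only two states, `UNTRUSTED_TEXT` and `VALIDATED_CODE`, and does not try to map the MU-TXT, MT-TXT, and HVC labels one-to-one; the names `Artifact`, `from_llm`, `review`, and `commit` are invented for this example.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Optional

class Trust(Enum):
    UNTRUSTED_TEXT = auto()   # raw LLM output
    VALIDATED_CODE = auto()   # explicitly accepted by a human reviewer

@dataclass(frozen=True)
class Artifact:
    text: str
    trust: Trust = Trust.UNTRUSTED_TEXT
    reviewer: Optional[str] = None

def from_llm(text: str) -> Artifact:
    """Anything coming out of the model starts as untrusted text."""
    return Artifact(text=text)

def review(artifact: Artifact, reviewer: str, accepted: bool) -> Artifact:
    """Only an explicit human acceptance promotes text to code."""
    if not accepted:
        return artifact  # stays untrusted: back to prompting or editing
    return Artifact(artifact.text, Trust.VALIDATED_CODE, reviewer)

def commit(artifact: Artifact, codebase: list) -> None:
    """The codebase accepts validated code only."""
    if artifact.trust is not Trust.VALIDATED_CODE:
        raise PermissionError("untrusted text cannot enter the codebase")
    codebase.append(artifact.text)

codebase = []
draft = from_llm("def add(a, b): return a + b")
try:
    commit(draft, codebase)          # rejected: nobody validated it yet
except PermissionError:
    pass
approved = review(draft, reviewer="alice", accepted=True)
commit(approved, codebase)           # accepted: a human took ownership
```

The design choice worth noticing is that there is no path from `from_llm` to `commit` that skips `review`, and that the reviewer's name travels with the artifact.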

That’s why the problem is not the language you choose, and why you shouldn’t pick a programming language just because it happens to have good LLM support at a given moment in time. It’s a human issue, not a technical one.

Before validation, you don’t own that codebase. So please choose a language that allows you to always be the owner, no matter how complex things get in the future.

Conclusions:

It’s a Human In The Loop flow

Whatever language you choose, the review step has to be considered critical in the development lifecycle. If your management tends to underestimate it and/or to place too much trust in agent-driven automation, consider it a red flag.

Ownership is a cognitive responsibility.

It isn’t just “who wrote it”, it’s who holds the mental model. The person who owns the codebase is the one who can answer:

  • Why does this code exist?
  • What tradeoffs were made, and why?
  • Where does it break under pressure?
  • What would need to change if requirements shifted?

Good technical management never allows shipping a single line of code the team produced unless it is fully understood.

AdP