惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

F
Full Disclosure
Recorded Future
Recorded Future
T
Tenable Blog
S
Securelist
C
CERT Recently Published Vulnerability Notes
T
Threatpost
S
Schneier on Security
A
Arctic Wolf
The Hacker News
The Hacker News
C
CXSECURITY Database RSS Feed - CXSecurity.com
Know Your Adversary
Know Your Adversary
P
Privacy International News Feed
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
The Register - Security
The Register - Security
Cisco Talos Blog
Cisco Talos Blog
AWS News Blog
AWS News Blog
K
Kaspersky official blog
T
True Tiger Recordings
T
Threat Research - Cisco Blogs
V
Vulnerabilities – Threatpost
P
Palo Alto Networks Blog
T
The Exploit Database - CXSecurity.com
小众软件
小众软件
B
Blog
Cyber Security Advisories - MS-ISAC
Cyber Security Advisories - MS-ISAC
Microsoft Azure Blog
Microsoft Azure Blog
Cyberwarzone
Cyberwarzone
C
Cybersecurity and Infrastructure Security Agency CISA
T
Tor Project blog
Spread Privacy
Spread Privacy
Malwarebytes
Malwarebytes
P
Proofpoint News Feed
F
Fox-IT International blog
F
Fortinet All Blogs
P
Privacy & Cybersecurity Law Blog
G
GRAHAM CLULEY
量子位
Latest news
Latest news
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
博客园 - 叶小钗
Project Zero
Project Zero
T
Tailwind CSS Blog
N
Netflix TechBlog - Medium
Martin Fowler
Martin Fowler
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
IntelliJ IDEA : IntelliJ IDEA – the Leading IDE for Professional Development in Java and Kotlin | The JetBrains Blog
I
Intezer
博客园_首页
腾讯CDC
H
Hackread – Cybersecurity News, Data Breaches, AI and More
D
Darknet – Hacking Tools, Hacker News & Cyber Security

Hacker News - Newest: "AI"

Your AI Tools Are Only as Good as Your Judgment — And That's the Point GitHub - shubhamgoel27/artifold: 📚 A local-first library for the stuff you make with AI. Index, search, preview, share — and use your past work as the style guide for your next one. Qualcomm strikes AI chip deal with TikTok owner ByteDance Why I Made a Journal for AI-Generated Papers — Cesar A. Hidalgo AI Billing is (mostly) token plumbing Xiaomi MiMo Api Open Platform - Token Plan Global Launch GitHub - aarifmms/keyblind: keyblind New studies find systematic religious bias in ChatGPT, other AI Meta and Google AI safety controls can be stripped in minutes, Financial Times testing finds SK hynix unveils self-cooling iHBM chips to combat AI overheating ByteDance offers AI team special stock to combat poaching GitHub - Agile-V/agile_v_skills: 🔬 Verifiable AI-Augmented Engineering Framework - Stop AI hallucinations with formal traceability (REQ→ART→TC). Agent Skills for Claude Code, Cursor, VS Code & Copilot. Enterprise-grade: ISO 9001, ISO 27001, GxP-ready. Red Team verification, multi-cycle lifecycle, behavioral anti-patterns. The Collaborative Exoskeleton of AI Science GitHub - AlphaBitCore/nexus-gateway The Five Pillars of AI Agent Accountability: A Diagnostic Framework for Engineering Leaders AI agents imperiled by critical vulnerability in open source package The Vibe Coding Era: Why AI Won't Replace Software Engineers [video] AI agents are scrambling power users' brains Ask HN: Has AI affected negatively the job market for devs? Show HN: I built a tool to auto-accept AI slop and bigtech devs loves it OpenAI admits AI hallucinations are mathematically inevitable, not just engineering flaws starlette - secwest.net - secure virtual engagement Shopify's AI Developer Sam Altman and Dario Amodei are both walking back their AI jobs apocalypse prophecies as they eye blockbuster IPOs | Fortune twitter.com Robotics giant Figure AI demonstrates its robots to the world Bay Area mom out thousands after scammers use AI to mimic daughter's voice in fake kidnapping The Swing Sensei App - App Store 6 Million Fake GitHub Stars: How to Vet Open-Source AI Tools Before You Bet on Them Why AI's Biggest Deals Price Assets Before Revenue AI chatbots show bias toward Catholicism, researchers say LMIM OS – an offline AI ecosystem. Voice, RAG, WhatsApp. ++ One file. 0 setup Authors versus AI and the risks to government public sector push There's at Least One Job That AI Isn't Killing AskMingLi: AI-assisted BaZi chart readings AI Isn't Management. Try Explaining That to Matthew Prince Who Wants to Be Hired? (May 2026) – AI Engineer (Python, RAG, Agentic Workflows) twitter.com The AI Industry Just Walked Into the Vatican Humanize – two LLM-agnostic skills to rewrite and detect AI text HypeScribe – AI-powered transcription, summaries, and search for any audio/video GitHub - NikhilSKashyap/interviewsignal: AI-native broad-interviewing. Share a code, capture thought process, auto-grade on submit. pip install, zero setup cost, pure signal. Uber burned through its entire 2026 AI budget in four months. Now its COO is questioning whether it's worth it | Fortune FlowLink: MCP proxy blocking destructive AI agent commands Blitzy AI charges by LOC generated AI-Related Issues in Securities Cases: Privilege Pitfalls, 'AI Washing' Claims AI is killing All About Berlin Pheno: AI-Powered Personalized Health Platform GitHub - rishavsunny12/harvestGuard: Lets see how claude code creatively creates a project for me NES, SNES, Genesis, VirtualBoy, and PSX | A journey with AI and Recompilation The Rise of the AI Script Kiddie Stack Overflow's forum is dead thanks to AI SpaceX's AI Pursuits Have yet to Take Off Do AI Risks Require Extraordinary Government Intervention? GitHub - Dylanchess0320/LuckyD-Code: LuckyD Code - Terminal AI Assistant / Discord - https://discord.gg/ApEKKUuKd I applied to YC with an AI-native IDE for hardware prototyping AI may be fuelling U.S. business creation, but few signs of a similar trend in Canada A Board Game agent built using Sanity Context and Vercel's AI SDK | Sanity Microsoft’s GitHub was positioned to win the AI coding race. Outages got in the way Too dangerous to release: is Mythos the start of the restricted-AI era? Show HN: Audiogen – a new take on generative music AI ScribeItLocal — Free Local Video & Audio Transcription The Three-Cylinders Problem — When AI Models Choose Beauty Over Truth Show HN: MurrDB: A RocksDB-based NVMe/S3 cache for AI inference workloads The rise of the -10x engineer: The negative side of AI productivity Safe Ways to Use AI Agents Programming Is Real Engineering, And AI Proves It What AI race? China and U.S. AI are tightly connected High-VRAM GPUs aren't the future of local AI GitHub - mbbill/mind-expander: A shared visual workspace for understanding and steering code with AI agents. Show HN: We made a cinematic heist trailer with 4 AI models for $60 Release shield-v0.7.0 · AperionAI/shield AI Startup Says It Will Pay People $2,000 a Month to Masturbate—Yes, Really MCP: Security Design Considerations for AI-Driven Automation by NSA [pdf] Rethinking organizational design in the age of agentic AI Client Challenge GitHub - takshd15/Laptop-AI GitHub - SynapCores/synapcores-agent: Real, framework-free AI support agent where SynapCores is the brain — memory, RAG, tool routing, generation in one database. Browser chat widget + live Brain debug sidebar. Fork and run in 30s. The Math Changed AI-Augmented Software Development Manifesto Whisper by Remskill — AI Voice Assistant for Desktop AI tools lead to 'clear racial disparities' in job hiring Excerpts from Pope Leo XIV's manifesto on humanity and AI | AP News GitHub - StackOneHQ/stack-nudge ‘BusPatrol’ Put AI Cameras in Tens of Thousands of School Buses. Now They Want to Give Cops Access AI Killed Stack Overflow (and why that sucks) AI-Powered Cyber Attacks in 2026: How Adversaries Are Evolving Rogue states are putting AI agents to work on sanctions evasion Show HN: Treats Human and AI the Same Seventy years of mathematics built the thing we call AI Genre glitches and unexpected promotional phrases as a sign of AI writing Reverse centaurs and the failure of AI (2021) HVTracker – trust registry for open-source AI agents The Inevitability: Why AI Cannot Be Stopped, Slowed, or Resisted WebBridge - Let Kimi Agent Drive Your Browser | Kimi RTMH: Pope Leo’s Magnifica Humanitas on AI — LessWrong GitHub - SkepticCTO/decoding_the_language_machine: Documentation, Prompts, and Media for the "Decoding the Language Machine" series Block open-sourced Goose, an AI agent that scaled to 60% of the company Beyond Recall: Behavioral Specification as an Interpretive Layer for AI Personalization GitHub - compuficial/apery: Synthetic Data Generator for Agents
When AI Writes the World's Software, Who Verifies It? — Leonardo de Moura
fagnerbrack · 2026-05-27 · via Hacker News - Newest: "AI"

2026-02-28

AI Is Rewriting the World's Software

Code Metal recently raised $125 million to rewrite defense industry code using AI. Google and Microsoft both report that 25–30% of their new code is AI-generated. AWS used AI to modernize 40 million lines of COBOL for Toyota. Microsoft's CTO predicts that 95% of all code will be AI-generated by 2030. The rewriting of the world's software is not coming. It is underway.

Anthropic recently built a 100,000-line C compiler using parallel AI agents in two weeks, for under $20,000. It boots Linux and compiles SQLite, PostgreSQL, Redis, and Lua. AI can now produce large-scale software at astonishing speed. But can it prove the compiler correct? Not yet.

No one is formally verifying the result.

Andrej Karpathy described the pattern: "I 'Accept All' always, I don't read the diffs anymore." When AI code is good enough most of the time, humans stop reviewing carefully. Nearly half of AI-generated code fails basic security tests, and newer, larger models do not generate significantly more secure code than their predecessors. The errors are there. The reviewers are not. Even Karpathy does not trust it: he later outlined a cautious workflow for "code [he] actually care[s] about," and when he built his own serious project, he hand-coded it.

Consider what happens at scale. A single bug in OpenSSL — Heartbleed — exposed the private communications of millions of users, survived two years of code review, and cost the industry hundreds of millions of dollars to remediate. That was one bug, introduced by one human, in one library. AI is now generating code at a thousand times the speed, across every layer of the software stack, and the defenses we relied on (code review, testing, manual inspection) are the same ones that missed Heartbleed for two years.

The Harvard Business Review recently documented what it calls "workslop": AI-generated work that looks polished but requires someone downstream to fix. When that work is a memo, it is annoying. When it is a cryptographic library, it is catastrophic. As AI accelerates the pace of software production, the verification gap does not shrink. It widens. Engineers stop understanding what their systems do. AI outsources not just the writing but the thinking.

The threat extends beyond accidental errors. When AI writes the software, the attack surface shifts: an adversary who can poison training data or compromise the model's API can inject subtle vulnerabilities into every system that AI touches. These are not hypothetical risks. Supply chain attacks are already among the most damaging in cybersecurity, and AI-generated code creates a new supply chain at a scale that did not previously exist. Traditional code review cannot reliably detect deliberately subtle vulnerabilities, and a determined adversary can study the test suite and plant bugs specifically designed to evade it. A formal specification is the defense: it defines what "correct" means independently of the AI that produced the code. When something breaks, you know exactly which assumption failed, and so does the auditor.

Poor software quality already costs the U.S. economy $2.41 trillion per year, according to a 2022 study by the Consortium for Information & Software Quality. That number was calculated before AI began writing a quarter or more of new code at leading companies. Chris Lattner, the creator of LLVM and Clang, put it bluntly: AI amplifies both good and bad structure. Bad code at AI speed becomes "incomprehensible nightmares." As AI generates an increasing share of the world's critical infrastructure (financial systems, medical devices, defense, transportation), unverified code becomes a systemic risk, not just a quality problem.

Most critical software today works, maintained by talented engineers using testing, code review, and hard-won experience. The problem is not that everything is broken. It is that AI is changing the scale and speed of software production faster than our ability to verify it. What works at human pace may not survive AI pace.

It will get worse, and it will get worse fast, unless verification scales with generation.

Why Mathematical Proof

Testing and proof are complementary. Testing, including property-based testing and fuzzing, is powerful: it catches bugs quickly, cheaply, and often in surprising ways. But testing provides confidence. Proof provides a guarantee. The difference matters, and it is hard to quantify how high the confidence from testing actually is. Software can be accompanied by proofs of its correctness, proofs that a machine checks mechanically, with no room for error. When AI makes proof cheap, it becomes the stronger path: one proof covers every possible input, every edge case, every interleaving. A verified cryptographic library is not better engineering. It is a mathematical guarantee.

Consider an example. An AI rewrites a TLS library. The code passes every test. But the specification requires constant-time execution: no branch may depend on secret key material, no memory access pattern may leak information. The AI's implementation contains a subtle conditional that varies with key bits, a timing side-channel invisible to testing, invisible to code review. A formal proof of constant-time behavior catches it instantly. Without the proof, that vulnerability ships to production. Proving such low-level properties requires verification at the right level of abstraction, which is why the platform must support specialized sublanguages for reasoning about timing, memory layout, and other hardware-level concerns.

The Claude C Compiler illustrates the other side: it optimizes for passing tests, not for correctness. It hard-codes values to satisfy the test suite. It will not generalize. Property-based testing would likely catch this particular case, but the general problem remains: for any fixed testing strategy, a sufficiently adversarial system can overfit to it. A proof cannot be gamed. It covers all inputs by construction.

The friction of writing code manually used to force careful design. AI removes that friction, including the beneficial friction. The answer is not to slow AI down. It is to replace human friction with mathematical friction: let AI move fast, but make it prove its work. The new friction is productive: writing specifications and models, defining precisely what "correct" means, designing before generating.

The obstacle has always been cost. Writing proofs by hand was too expensive to apply broadly. AI changes the economics. Proof is becoming practical at scale.

Accelerating Progress

Most people think of verification as a cost, a tax on development, justified only for safety-critical systems. That framing is outdated. When AI can generate verified software as easily as unverified software, verification is no longer a cost. It is a catalyst.

The value is not in the verification workforce. It is in what verified delivery enables. Consider a company delivering ML kernels for new hardware. Today, months go to testing and qualification. When AI writes the kernel and proves it correct in one pass, that timeline collapses to hours. A provably correct hardware design delivered in weeks rather than the year it currently takes changes the economics of an entire industry.

Verification, testing, and specification have always been the bottleneck, not implementation. Good engineers know what they want to build. They just cannot afford to prove it correct. If that cost drops to near zero, every domain where correctness matters accelerates. Aerospace, automotive, and medical device certification currently takes years of qualification effort. Cloud providers invest similar effort qualifying security-critical services and cryptographic implementations. Verified code generation could collapse that timeline to weeks. Hardware verification, where a single bug can cost hundreds of millions of dollars, benefits equally.

This acceleration requires specifications: precise descriptions of what the software must do. As AI takes over implementation, specification becomes the core engineering discipline. Writing a specification forces clear thinking about what a system must do, what invariants it must maintain, what can go wrong. This is where the real engineering work has always lived. Implementation just used to be louder.

Writing specifications is not always easy, but it is easier than writing the optimized implementation. And a powerful shortcut exists: an inefficient program that is obviously correct can serve as its own specification. User and AI co-write a simple model, AI writes an efficient version, and proves the two equivalent. The hard part shifts from implementation to design. That is the right kind of hard.

This is not limited to critical systems. Any non-trivial engineering project where bugs are expensive (which is most of them) accelerates when correctness is cheap.

What the Verification Platform Needs

What would a verification platform for the AI era require?

A small, trusted kernel: a few thousand lines of code that check every step of every proof mechanically. Everything else (the AI, the automation, the human guidance) is outside the trust boundary. Independent reimplementations of that kernel, in different languages (Lean, Rust), serve as cross-checks. You do not need to trust a complex AI or solver; you verify the proof independently with a kernel small enough to audit completely. The verification layer must be separate from the AI that generates the code. In a world where AI writes critical software, the verifier is the last line of defense. If the same vendor provides both the AI and the verification, there is a conflict of interest. Independent verification is not a philosophical preference. It is a security architecture requirement. The platform must be open source and controlled by no single vendor.

The platform must be both a programming language and a theorem prover, with code and proofs in one system, with no translation gap. It needs a rich and extensible tactic framework that gives AI structured, incremental feedback: here is the current goal, here are the hypotheses available, here is what changed after each step. AI must control the proof search, not delegate to a black box.

The alternative, push-button solvers that return a binary pass or fail with no intermediate state, gives AI nothing to learn from and no way to guide the search. Worse, proofs that rely on heuristic solvers often break when the solver updates or when developers make small changes to how they write their specifications, even when the changes are logically equivalent. You cannot build a reliable AI pipeline on a foundation that is not reproducible. (I discuss this in detail in a recent Stanford talk.)

It needs the largest possible library of formalized knowledge to build on. Verifying software is mathematics: the same reasoning that proves a theorem in abstract algebra proves that a cryptographic library correctly implements its specification. A platform that serves mathematicians and engineers is not a compromise. It is the recognition that rigorous reasoning is one discipline, whether applied to prime numbers or protocol correctness.

And it needs deep extensibility. Users and AI must be able to write extensions that access the system's internals, building custom tools, automation, and domain-specific reasoning engines. This is already happening: AI agents build their own proof strategies on top of the platform. The platform adapts to its users, not the other way around.

The AI community has already made its choice. AlphaProof (Google DeepMind), Aristotle (Harmonic), SEED Prover (ByteDance), Axiom, Aleph (Logical Intelligence), and Mistral AI all build on Lean. Every major AI reasoning system that has achieved medal-level performance at the International Mathematical Olympiad used Lean. No competing platform was used by any of them. The future is much larger than today's early applications.

Lean is backed by Mathlib, the largest coherent body of formalized mathematics ever created: over 200,000 formalized theorems and 750 contributors. Five Fields medalists engage with Lean. The same platform serves mathematicians formalizing theorems and engineers verifying production systems. ACM SIGPLAN recognized this convergence with its 2025 Programming Languages Software Award: "Lean has become the de facto choice for AI-based systems of mathematical reasoning."

Enterprise teams already use Lean in production: AWS verified its Cedar authorization policy engine, and Microsoft is using Lean to verify its SymCrypt cryptographic library. Over 8,000 GitHub repositories contain Lean code. Over 200,000 users have installed the programming environment. More than 700 people are active in the Lean Zulip channel every day. Research groups worldwide contribute to the ecosystem. As Chris Lattner observed, manual rewrites and translation work are becoming AI-native tasks. AI will rewrite the world's codebase. The platform it does so on matters enormously.

Lean already produces performance comparable to Haskell and OCaml. When higher performance is essential, Lean models can be translated into efficient imperative code embedded in Lean, with clean semantics and without C's undefined behavior. We are actively working on closing the remaining gap for performance-critical code. The real comparison is not Lean versus C. It is verified code versus unverified code.

Lean is the result of over twelve years of continuous development. We designed and built every layer from scratch: the trusted kernel, the compiler, the language server, the IDE, the proof automation. The team is 20 people. The community independently chose Lean: mathematicians, AI researchers, and enterprise engineers, all building on the same platform.

A Sign of Things to Come

At the Lean FRO, Kim Morrison, a Senior Research Software Engineer, recently ran an experiment that went well beyond our expectations. An AI agent converted zlib, a widely used C compression library embedded in countless systems, to Lean, with minimal human guidance. No special tooling was built. It was Claude, a general-purpose AI, with no special training for theorem proving, out of the box. The workflow had four steps. First, the AI produced a clean, readable Lean implementation of the zlib compression format, including the DEFLATE algorithm at its core. Second, the Lean version passed the library's existing test suite, confirming behavioral equivalence. Third, key properties were stated and proved, not as tests, but as mathematical theorems. The capstone theorem:

theorem zlib_decompressSingle_compress (data : ByteArray) (level : UInt8)
    (hsize : data.size < 1024 * 1024 * 1024) :
    ZlibDecode.decompressSingle (ZlibEncode.compress data level) = .ok data

This is a machine-checked proof that decompressing a compressed buffer always returns the original data, at every compression level, for the full zlib format. The size precondition (hsize) is a proof engineering artifact: it bounds the fuel the AI used during proof construction. A human engineer would likely eliminate it; the AI, working without guidance, chose a bound that was sufficient to close the proof. As both AI capabilities and platform tooling improve, artifacts like this will disappear. The raw DEFLATE and gzip framing roundtrips are proved as well. Fourth, an optimized version is being developed and proved equivalent to the verified model. This work is ongoing.

The entire proof was constructed with minimal human guidance and no specialized tooling. The result demonstrates that AI can convert production software to a verified form today. This was not expected to be possible yet.

zlib is a sequential algorithm with a clean RFC specification, simpler than a database engine or a distributed system. The gap between this result and a verified software stack is real. But until recently, no one had demonstrated AI-generated proofs about production software. The trajectory matters more than the starting point. Crucially, the zlib proof was produced by a general-purpose AI with no specialized training for theorem proving. No custom model was needed. This means the barrier to verified software is no longer AI capability. It is platform readiness. As general-purpose AI improves, the bottleneck shifts entirely to the verification platform: how rich is the feedback it gives AI, how powerful is the automation, how large is the library of prior knowledge to build on.

The zlib workflow also reveals something testing structurally cannot do. When your tests and your code share the same wrong assumptions, testing finds nothing. The act of proving forces every assumption to be explicit: a designer states a property, the AI turns it into a precise formal statement, and then either constructs a proof or discovers that the specification needs revision. Wrong assumptions surface as proof obligations that cannot be discharged. Specifications become sharper through the attempt to prove them.

The approach extends beyond sequential algorithms to distributed systems, where the hardest bugs live. Ilya Sergey's group at NUS built Veil, a distributed protocol verifier on top of Lean that combines model checking with full formal proof. When a property does not hold, Veil generates concrete counterexamples: actual execution traces that demonstrate the failure, just as model checkers do. When it does hold, Veil produces a full formal proof, not just the absence of counterexamples up to some bound. Veil has verified protocols including Rabia, a randomized consensus protocol, proving agreement and validity for any number of nodes. During verification, the team discovered an inconsistency in a prior formal verification of Rabia that had gone undetected across two separate tools. For simpler protocols, Claude Code can infer most of the required auxiliary invariants within minutes.

The broader picture confirms the trajectory. AI systems using Lean have solved open mathematical problems that resisted decades of human effort. A single mathematician working with an AI agent formalized the full Prime Number Theorem in three weeks: 25,000 lines of Lean, over 1,000 theorems. The previous formalization took over a year and dozens of contributors. Fields Medalist Maryna Viazovska's proof that the E8 lattice is the optimal sphere packing in eight dimensions has been formally verified in Lean, with an AI agent completing the final steps. Google DeepMind's AlphaEvolve discovered a new mathematical construction, and AlphaProof formalized the proof in Lean, a complete AI pipeline from discovery to verified proof.

The productivity gap is widening: teams with the best tools are pulling further ahead while others stagnate. Verification will be a decisive advantage.

The Verified Stack

Where this leads is clear. Layer by layer, the critical software stack will be reconstructed with mathematical proofs built in. The question is not whether this happens, but when.

It is already beginning. The ecosystem is growing. Startups are training AI on proof data, building verification tools, and growing talent. This is phase one: verifying existing code using Lean. Phase two is reconstruction: building the stack in Lean from the ground up, with proofs built in from the start. Both phases happen on the same platform. Both strengthen it.

The target is the foundation of the modern software stack: cryptography, because everything else trusts it. Core libraries (data structures, algorithms, compression) because they are the building blocks of all software. Storage engines like SQLite, embedded in every device on earth. Parsers and protocol implementations (JSON, HTTP, DNS, certificate validation) because every message passes through them. And compilers and runtimes, because they build everything else.

Consider SQLite, embedded in every smartphone, every browser, every major operating system. A verified SQLite carries proofs that a crash during a write cannot corrupt the database, that concurrent readers never see partial transactions. Testing tools like Jepsen and stateful property-based testing find many such bugs in practice, but they provide confidence, not a guarantee. Mathematical proof covers every crash point and every interleaving.

Each verified component is a permanent public good. Unlike proprietary software, a verified open-source library cannot be degraded, cannot have its guarantees quietly revoked, and cannot be held hostage by a single company's business decisions. The proofs are public. Anyone can audit them, build on them, or replace the implementation while preserving the guarantees. This is infrastructure in the deepest sense: it raises the floor for everyone. When you build on this stack, you inherit the proofs. Each layer is independent. The platform and methodology come first; then the work parallelizes naturally.

Once verified components are cheap, you compose them with confidence. Today, integration is where most bugs live: the boundaries between components, the assumptions that do not quite match. Integration testing exists precisely because component tests do not compose: passing unit tests on two components tells you nothing about whether they work together. Verified interfaces are fundamentally different. When each component carries a proof against a shared specification, the composition is guaranteed correct by construction. You can verify each piece independently and know the whole is sound. This advantage scales superlinearly: the larger the system, the wider the gap between tested and verified.

The World This Creates

The goal is a verified software stack: open source, freely available, mathematically guaranteed correct. Developers building critical systems choose verified components the way they choose open-source libraries today, except these carry proofs, not just tests. No single team builds this alone, and no single platform is guaranteed to be the right one forever. What matters is that verified software becomes the norm. If something better than Lean emerges, the mission has still succeeded. It is an open challenge for the entire community.

The role of the engineer changes, but it does not shrink. Engineers spend more time writing specifications and models, designing systems at a higher level of abstraction, defining precisely what systems must do, what invariants they must maintain, what failures they must tolerate. The work becomes more creative, not less: designing before generating, thinking before building. Productivity comes not from generating more code, but from generating code that is provably correct on the first attempt.

Specifications also raise a question that is not purely technical: who decides what "correct" means? A specification for a medical device, a voting system, or an AI safety monitor encodes values, not just logic. Making specifications formal and public does not resolve these questions, but it makes them explicit and auditable in a way that buried code never can. AI itself can bridge the accessibility gap: the same systems that generate verified code can explain specifications in plain language, making formal guarantees accessible beyond the experts who write them.

The AI industry has a strategic stake in this outcome. An AI that generates provably correct code is qualitatively different from one that merely generates plausible code. Verification transforms AI code generation from a productivity tool into a trust infrastructure.

AI is going to write a great deal of the world's software. It will advance mathematics, science, and engineering in ways we cannot yet anticipate. The question is whether anyone can prove the results correct.

---

Lean is open source and freely available at lean-lang.org. To learn Lean, start with our documentation. To join the community, visit the Lean Zulip. The Lean Focused Research Organization builds and maintains the platform. If you want to support this work, contact us at contact@lean-fro.org.