惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

Y
Y Combinator Blog
V
Visual Studio Blog
Cloudbric
Cloudbric
Project Zero
Project Zero
Security Latest
Security Latest
Microsoft Azure Blog
Microsoft Azure Blog
A
Arctic Wolf
Cyberwarzone
Cyberwarzone
Cisco Talos Blog
Cisco Talos Blog
爱范儿
爱范儿
The GitHub Blog
The GitHub Blog
AWS News Blog
AWS News Blog
有赞技术团队
有赞技术团队
腾讯CDC
T
Threatpost
Simon Willison's Weblog
Simon Willison's Weblog
WordPress大学
WordPress大学
博客园 - 叶小钗
The Cloudflare Blog
Know Your Adversary
Know Your Adversary
M
MIT News - Artificial intelligence
Scott Helme
Scott Helme
K
Kaspersky official blog
MongoDB | Blog
MongoDB | Blog
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
博客园 - 三生石上(FineUI控件)
L
LINUX DO - 最新话题
B
Blog RSS Feed
K
KPMG report finds enterprise disconnect between AI and its ROI | CIO
H
Heimdal Security Blog
PCI Perspectives
PCI Perspectives
G
Google Developers Blog
博客园 - 司徒正美
B
Blog
C
Cisco Blogs
L
LINUX DO - 热门话题
大猫的无限游戏
大猫的无限游戏
Google DeepMind News
Google DeepMind News
W
WeLiveSecurity
V
V2EX - 技术
N
News and Events Feed by Topic
Microsoft Security Blog
Microsoft Security Blog
AI
AI
The Last Watchdog
The Last Watchdog
博客园_首页
H
Hacker News: Front Page
H
Help Net Security
F
Fortinet All Blogs
P
Proofpoint News Feed
Hugging Face - Blog
Hugging Face - Blog

Hacker News: Front Page

Trump administration reclassifies cannabis as less dangerous Release raylib v6.0 · raysan5/raylib GitHub - russellromney/honker: SQLite extension + bindings for Postgres NOTIFY/LISTEN semantics with durable queues, streams, pub/sub, and scheduler Writing a C Compiler, in Zig crawshaw - 2026-04-22 MacBook Neo and How the iPad Should Be It's time to reclaim the word "Palantir" for J.R.R. Tolkien Arch Linux now has a bit-for-bit reproducible Docker image Fundamental Theorem of Calculus | David Álvarez Rosa | Personal Website Bring Your Agent to Teams Ars Technica newsroom AI policy France confirms data breach at government agency that manages citizens’ IDs New study compares growing corn for energy to solar production. It's no contest NAEP Long-Term Trend Assessment Results: Reading and Mathematics Convergent Evolution: How Different Language Models Learn Similar Number Representations We found a stable Firefox identifier linking all your private Tor identities GitHub - besimple-oss/broccoli: Broccoli turns Linear tickets into shipped PRs — powered by Claude and Codex, running on your own Google Cloud. Youth Suicides Declined After Creation of National Hotline Top MAGA influencer revealed to be AI — created by a guy in India who made a mint off lonely men online Ping-pong robot beats top-level human players Announcing DuckDB 1.5.2 The handmade beauty of Machine Age data visualizations Treetops glowing during storms captured on film for first time Columnar Storage is Normalization TPU 8t and TPU 8i technical deep dive Our eighth generation TPUs: two chips for the agentic era Introducing Google Cloud Fraud Defense, the next evolution of reCAPTCHA Kernel code removals driven by LLM-created security reports tante.cc Nobody Got Fired for Uber's $8 Million Ledger Mistake? Introducing workspace agents in ChatGPT Sure, xor’ing a register with itself is the idiom for zeroing it out, but why not sub? What Async Promised and What it Delivered — Causality GitHub - justrach/kuri: Browser automation and web crawling for AI agents. Zig-native, token-efficient CDP snapshots, HAR recording, and a standalone fetcher. Drunk Post: Things I’ve Learned as a Senior Engineer Claude Code to be removed from Anthropic's Pro plan? Another Day Has Come 'Something sinister could be happening': FBI looks into dead or missing nuclear and space defense scientists tied to NASA, Blue Origin, and SpaceX | Fortune GitHub - calcom/cal.diy: Scheduling infrastructure for absolutely everyone. Meta to start capturing employee mouse movements, keystrokes for AI training The Vercel Breach: OAuth Supply Chain Attack Exposes the Hidden Risk in Platform Environment Variables Member of Technical Staff, Product Engineering (full-time) at Trellis AI | Y Combinator CATL's new LFP battery can charge from 10 to 98% in less than 7 minutes Jobs at Bloom | Y Combinator The printing press for biological data (Sterling Hooten) Brussels launched an age checking app. Hackers took 2 minutes to break it Inside GitHub's Fake Star Economy The Illuminated Man by Christopher Priest and Nina Allan review – an unconventional portrait of JG Ballard IEA: Solar overtakes all energy sources in a major global first Stripe’s payments APIs: The first 10 years GitHub - esutcu/planb-lpm GitHub - browser-use/browser-harness: Self-healing browser harness that enables LLMs to complete any task. Claude Token Counter, now with model comparisons GitHub - shivampkumar/trellis-mac Six levels of dark mode The Bromine Chokepoint: How Strife in the Middle East Could Halt Production of the World’s Memory Chips Turtle WoW classic server announces shutdown after Blizzard wins injunction Scoring 500 Show HN pages for AI design patterns Vercel April 2026 security incident | Vercel Knowledge Base Dubai police arrest airline worker after accessing private WhatsApp group Prompt → Diagram — Gemma 4 E2B in desktop Chrome (WebGPU) Binary GCD - Algorithmica madhadron - The seven programming ur-languages Keep Pushing: We Get 10 More Days to Reform Section 702 The world in which IPv6 was a good design Zero-Copy GPU Inference from WebAssembly on Apple Silicon The RAM shortage could last years Any Color You Like: NIST Scientists Create ‘Any Wavelength’ Lasers in Tiny Circuits for Light Optimizing Ruby Path Methods A college instructor turns to typewriters to curb AI-written work and teach life lessons UpCodes | Careers The electromechanical angle computer inside the B-52 bomber's star tracker Why Japan has such good railways - Works in Progress Magazine State of Kdenlive - 2026 GitHub - smol-machines/smolvm: Tool to build & run portable, lightweight, self-contained virtual machines. Head of Engineering at Kyber | Y Combinator GitHub - paniclock/paniclock: Instantly disable Touch ID and lock your Mac with one click or keyboard shortcut. Detecting DOSBox from within the Box I Measured Claude 4.7's New Tokenizer. Here's What It Costs You. Introducing Claude Design by Anthropic Labs Middle schooler finds coin from Troy in Berlin It Is Time to Ban the Sale of Precise Geolocation Isaac Asimov: The Last Question Teddy Roosevelt and Abraham Lincoln in the same photo Healthchecks.io Now Uses Self-hosted Object Storage Bluesky has been dealing with a DDoS attack for nearly a full day. Harness Engineer at Substrate | Y Combinator GitHub - dacracot/Klondike3-Simulator SPICE simulation → oscilloscope → verification with Claude Code — Lucas Gerads Email could have been X.400 times better Newly unsealed records reveal Amazon’s price-fixing tactics, California attorney general claims GitHub - GainSec/AutoProber: Hardware hacker’s flying probe automation stack for agent-driven target discovery, microscope mapping, safety-monitored CNC motion, probe review, and controlled pin probing. A Better R Programming Experience Thanks to Tree-sitter Clojure - Documentary GPT‑Rosalind for life sciences research How a Tiny Yellow Handheld Changed How Duke University Teaches Game Design - Playdate News Android CLI and skills: Build Android apps 3x faster using any agent Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7 Codex for almost everything GitHub - GRVYDEV/marky: A lightweight easy to use markdown viewer
Formal methods and the future of programming
2026-06-14 · via Hacker News: Front Page

I’ve been telling people for the last 25 years that Jane Street as an organization was just not interested in formal methods.

I’m not saying that anymore.

It’s not exactly that I think we were wrong all those years. To be clear, we’re strong believers in the power of tools to help us write better and more reliable code. And type systems are a kind of lightweight formal method that we’ve gotten an enormous amount of benefit from. So you might expect us to have been big believers in more full-on formal methods.

But outside of some special cases (notably, hardware synthesis), our sense has been that formal methods were just not worth the costs for us. And those costs are really high! seL4 is a great example of this. It’s a formally verified microkernel, and a profound achievement. But, boy was it expensive to do! It took 25 person-years of effort to verify 8,700 lines of C, and each line of code required something like 23 lines of proof and a half a person-day to verify.

Our hope is to make formal methods as pervasively useful of a tool for building software as sophisticated type systems are for us today.

That kind of approach could be worth it for a security-critical microkernel, where the stakes are high and the specifications are fairly clear. But it just doesn’t make sense for most software, and to us it didn’t feel like it made sense for even our most critical software.

But the emergence of agentic coding has changed our perspective, and we’ve gone from being skeptical to being excited about the possibilities. And as a result, we’re now building a team to focus on formal methods. Our hope is to make formal methods as pervasively useful of a tool for building software as sophisticated type systems are for us today.

Why the change of heart?

Agentic coding upsets the formal-methods apple-cart in a few ways.

For one thing, it dramatically changes the cost of using formal methods. It’s not that agents can on their own construct arbitrarily challenging proofs.1 But models are enormously helpful, and broaden the set of people who can use these tools productively. With formal methods being easier to use than ever, it’s worth reconsidering the old cost/benefit calculus.

But things haven’t changed only on the cost side. The benefits seem bigger now too. There are really two reasons for this:

The verification bottleneck is more important than ever. Models are increasingly good at writing useful code. But there’s a big gap between the code that models generate, and code that you’d want to actually release. To some degree, this is an artifact of how the models are trained. They’re surprisingly good at achieving the goal you set in front of them, but they don’t do a great job of maintaining and even improving the quality of the codebase as they do so. Agentic code is getting better, but is still tends towards slop: overly complicated, full of weird bugs and corner cases, often not following essential invariants of the codebase that it’s a part of.

As a result, people need to spend a lot of time verifying that the code produced by agents is up to snuff. And formal methods could be a way of relieving some of that verification burden, and making the process of review a lot more efficient.

Separately, agents thrive on feedback. This is true both when you’re training agents using RL, and when you’re using agents to code. And formal methods are another powerful form of feedback that can increase the agents’ ability to solve hard problems.

A lot of why we're excited about full-on formal methods is that we see how valuable types are when programming with agents.

Not that formal methods are the only way of getting feedback. Tests are incredibly valuable as well, and can be made even better by leaning into property-based tests and fuzzing. And lord knows we’ve spent a lot of time building out testing infrastructure.

But tests aren’t enough! There are inherent limits in the power of tests to cover the state space that your program might explore. One of the things we’ve seen in our own programming in OxCaml is that agents benefit a ton from universal guarantees, the ∀ you get out of type systems. If your type system has a way of preventing data races, it lets you get rid of all2 data races. If you set up your types to make cross-site scripting vulnerabilities impossible, then you can really get rid of those entirely, in a way that mere testing has trouble doing.

Indeed, a lot of why we’re excited about full-on formal methods is that we see how valuable types are when programming with agents, both for easing the verification bottleneck and providing agents with better feedback, and that makes us excited to see how much more uplift could be available by leveraging more powerful proof techniques.

We have two things going for us: deep control of the language we're using, and a community of programmers who are ready for this.

Why do it here?

One question this raises is: why is Jane Street well positioned to work on this problem? The whole world is thinking about what agents mean for the future of programming, and there are endless startups looking for ways of mixing formal methods and agents. Why is this something we’d work on internally? And why should formal methods experts in the outside world be excited to join our efforts here?

For one thing, we have deep control of the language we’re using, and that lets us adjust that language to make it a better home for proof-oriented techniques. There are lots of potential directions to go here: from integrating modular specifications of properties into the type system, to adding type-level constraints around things like ownership and mutability to make certain kinds of proofs easier, to building proof techniques directly into the language.

We also have a community of programmers who are ready for this, or at least more ready than any serious programming community I’ve encountered. For most people who work on programming languages, the easy part is coming up with new and better ideas about how to make programming better. The hard part is convincing anyone to actually use those ideas for real work.

At Jane Street, things are different! We routinely have users angry at us because the new, weird type-system features we promised them aren’t coming fast enough. We have a lot of people with the right background to leverage these techniques, and a lot of baked in interest in getting things right and building high-quality software.

We think that user base will gives us the freedom to try a mixture of approaches; there are some near-term improvements we think we can make which will have pretty immediate impact, and some ambitious, longer-term visions for where we can get in a few years. Having an engaged and excited user base makes both of these approaches possible, and lets us learn from the first, while we build towards the second.

None of this is to say that we’re going to ignore work in the outside world. We’re excited and inspired by the work in a variety of other PL communities, built around tools like Lean, Dafny, Rocq, Agda, Iris, and too many more to mention. And we’re excited to look for ways of integrating OxCaml with some of these tools, to take advantage of the great infrastructure that’s already out there. But we also think there are some unique advantages that can only be realized by engaging with the language and the proof techniques at the same time.

Join us!

If this sounds interesting to you, consider applying! We’re looking for people in both London and New York. We’re in the early stages of interviewing people for these spots and building a team, and there’s an enormous amount of work ahead of us, and we’d love you to be a part of it.

Yaron Minsky joined Jane Street back in 2002, and claims the dubious honor of having convinced the firm to start using OCaml.