The Open/Closed Problem in AI
blog.mempko. · 2026-05-26 · via Lobsters

I went to the ninth MLSys conference in Seattle. This is a conference for people in research and industry who build ML systems. The vast majority of the work I saw was about building systems that train and use LLMs. The biggest focus was efficiency. How do you train LLMs more efficiently? How do you deploy and use them more efficiently? As I tried to make sense of the themes and messages I saw, the Open/Closed problem occurred to me.

To understand what the Open/Closed problem is, we first need to understand a little bit of history.

When 3D computer graphics exploded in the 90s, they were first rendered on the CPU. A CPU is a general-purpose computing device, so you can do anything on it. Naturally, 3D graphics varied wildly, with some games even using voxels instead of polygons. There was a great amount of creativity. Eventually we got 3D acceleration via graphics cards. These cards had fixed-function pipelines. So while they greatly accelerated polygon rendering and certain effects, they limited how you could do graphics, and variety was lost. Later, GPU makers like Nvidia introduced programmable pixel and vertex shaders. This added flexibility back into graphics, allowing more creative games. On top of this programmability, CUDA was born. CUDA was flexible enough that the AI community figured out how to train neural nets on GPUs, which allowed them to try bigger models. AlexNet was the inflection point, and the reason we are even talking about AI and LLMs today. What I noticed at MLSys is that the companies building GPUs are now constraining their hardware to be more efficient at either inference or training. There are now ASICs designed to do just inference, while others are optimized for training.

In other words, we started with an open system (CPU), went to a closed system (fixed-pipeline GPU), back to an open system (programmable GPU), and back to a more closed system (specialized ASICs). This is part of what I mean by the Open/Closed problem. But this also coincides with another Open/Closed problem in a different sense.

The AIs we are deploying are trained using an open loop. What I mean is that the models themselves don't learn. You need an outside system, outside the model's circuit, to train them. You gather data, come up with a loss function, and do SGD to train them via backpropagation. Then you deploy them. After the model is deployed it does not learn. Its memory, stored in its parameters, doesn't change. People are hacking around this fact using external memory via Agents. Agents use an LLM (which doesn't learn) to update an external memory source (like markdown files, a database, etc.) using tool calls. So Agents learn, but they learn in a very inefficient way.
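To make the open-loop pattern concrete, here is a minimal sketch in Python. It is a toy, not how LLMs are actually trained: the model is a one-variable linear fit, and the names `train_open_loop` and `deployed_predict` are made up for this illustration. The point is only the shape of the process: an outside loop adjusts the parameters with SGD, and once the model is deployed nothing in the prediction path can change them.

```python
import random


def train_open_loop(data, lr=0.05, epochs=200, seed=0):
    """Outside training loop: plain SGD on squared error for y ~ w*x + b."""
    rng = random.Random(seed)
    w, b = 0.0, 0.0
    samples = list(data)
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            err = (w * x + b) - y   # derivative of 0.5 * err**2 w.r.t. the prediction
            w -= lr * err * x       # gradient step on w
            b -= lr * err           # gradient step on b
    return (w, b)                   # returned as an immutable tuple: the "frozen" weights


def deployed_predict(params, x):
    """Inference only: reads the parameters and never writes them."""
    w, b = params
    return w * x + b


# Toy training set drawn from y = 2x + 1 (an assumption made for this sketch).
train_data = [(k / 10.0, 2.0 * (k / 10.0) + 1.0) for k in range(-10, 11)]
params = train_open_loop(train_data)

# Suppose the world later drifts to y = 3x + 1. The deployed model keeps
# answering with the old parameters, so its error at x = 1 stays put.
drift_error = abs(deployed_predict(params, 1.0) - (3.0 * 1.0 + 1.0))
```

However many times `deployed_predict` is called, `params` stays the same. Fixing the drift means going back to the outside loop and retraining, or bolting on external memory the way agents do.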

Our brains use a closed loop to learn. Our brains have a model of the outside world; they make predictions about what our senses should report, and then check our senses to see how far off the prediction is. If the prediction is wrong, the brain is surprised and updates the model to make a more accurate prediction. In other words, our brains need no outside process to accumulate knowledge. It all happens inside the brain, in a closed loop.
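The predict, compare, update cycle described above can be sketched as a toy in Python. This is not a model of the brain and not a proposal for how closed-loop learning should work at scale; it is an online error-correction rule (the classic delta rule) on a one-variable linear model, and `ClosedLoopLearner` and `world` are hypothetical names invented for this example. What it shows is that the update lives inside the same step as the prediction, with no separate training run.

```python
class ClosedLoopLearner:
    """Toy predictor that rewrites its own weights from its prediction error."""

    def __init__(self, lr=0.1):
        self.w = 0.0
        self.b = 0.0
        self.lr = lr

    def predict(self, x):
        return self.w * x + self.b

    def step(self, x, observed):
        """Predict, compare with what was observed, and update in place."""
        surprise = observed - self.predict(x)   # how far off the prediction was
        self.w += self.lr * surprise * x        # the model adjusts itself...
        self.b += self.lr * surprise            # ...inside the same step
        return abs(surprise)


def world(x, t):
    """Hypothetical environment whose rule changes halfway through."""
    slope = 2.0 if t < 500 else 3.0
    return slope * x + 1.0


learner = ClosedLoopLearner()
surprises = []
for t in range(1000):
    x = ((t * 8) % 21 - 10) / 10.0              # deterministic sweep over [-1, 1]
    surprises.append(learner.step(x, world(x, t)))
```

In this toy, surprise shrinks as the learner settles on the first rule, jumps when the world changes at step 500, and shrinks again as the weights move to the new rule, all without anything outside the learner stepping in.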

This is the other Open/Closed problem I noticed at MLSys. It seemed everyone is working to make open-loop learning better and more efficient, either by changing model architectures or the way you train them, optimizing GPU kernels, etc. I didn't see anyone working on closed-loop learning, where the model itself, without outside intervention, updates itself when it accumulates knowledge. These two Open/Closed problems are the same problem.

So here is the uncomfortable claim. The efficiency work the field is celebrating (better kernels, inference ASICs, training ASICs) is not just neutral progress. It is hardware hardening around open-loop learning, and every layer of specialization makes closed-loop learning harder to even attempt. We are optimizing our way into a paradigm and calling it advancement. Fixed GPU pipelines didn't just speed up graphics; they quietly killed the wild experiments for a decade until programmability came back. The same thing is happening now, and almost no one at MLSys seemed to notice.

And the mechanism isn't vague. An inference ASIC physically bakes in the open-loop assumption. The weights are frozen, so parameter memory is built to be read, not rewritten. Compute and memory sit in separate places because that is efficient when the model never changes. Everything is shaped around big batched matmuls because that is what serving a static model looks like. None of this is an oversight. It is the chip doing exactly its job. But a model that learns in a closed loop needs the opposite of all of it: weights that change constantly, updates at fine grain, memory and compute fused so a parameter can rewrite itself in place. A chip optimized for inference doesn't just fail to help with that. It assumes it away in silicon. Every generation of specialization pours more concrete over the road not taken.

Eric Kandel shared the 2000 Nobel Prize in Physiology or Medicine for work showing that learning physically changes the connections between neurons; memory isn't stored by some separate system. A single neuron both computes and physically rewrites itself as it learns. The breakthrough we need is a model that updates itself, with no outside process, no separate training run, memory and compute fused at fine grain the way they are in a neuron. That requires a substrate to experiment on: something like an FPGA, but bigger, faster, and built for this. Nobody is building it, because everybody is busy optimizing the thing we already have.

So I'll put it plainly. If you are working on open-loop efficiency, you are not working on the breakthrough. You are working on the thing that will make the breakthrough harder to find. The hardware is hardening around the wrong paradigm while the field congratulates itself on speed. The window to experiment with closed-loop learning is open right now, and it is closing with every ASIC that ships. Someone should build the substrate before it does.