




















Prominent programmer and hacker George Hotz warns that AI agents in software development do more harm than good. He says he's now in the "LeCun/Marcus camp," referring to AI researchers Yann LeCun and Gary Marcus, who doubt LLMs will ever become truly intelligent.
In his blog post "The Eternal Sloptember," Hotz argues that using AI agents in software development will become one of the industry's most expensive mistakes. He spent six months testing various models and tools, including work on tinygrad. His takeaway is that LLMs deliver fast prototypes but fall apart on the fine details.
Large organizations are especially at risk, he says, because weaker developers can't spot the flawed output. Hotz believes today's language models will never truly be able to code and that world models are needed instead. In his words, LLMs are "sophisticated statistical models" designed to "mimic the distribution of programming."
The output is flawed, but in a way that's "harder and harder to detect," exactly what you'd expect from an increasingly accurate statistical model, Hotz says. Quality indicators like syntax and grammar have become useless, he argues, since AI-generated artifacts don't emerge through the same process as human ones. As an example, he cites models that simply comment out a failing test and then report that all tests passed.
Hotz has switched sides: from LLM optimist ("o1-preview is the first model that's capable of programming (at all)") to skeptic. LeCun, whom Hotz cites, recently made a similar argument when he denied that LLMs possess intelligence: intelligence means finding solutions in unfamiliar situations, not imitating existing ones with varying accuracy.
Andrej Karpathy, one of the best-known AI researchers, went in the opposite direction. In fall 2025, he still said agents didn't work. Then GPT-5.4 and Opus 4.6 shipped in December, and he reversed course, saying AI agents had changed programming forever. Days ago, Karpathy joined Anthropic, leaving his startup behind. He expects "transformative years" ahead.
In a recent podcast, Karpathy doubles down. Anyone who uses AI agents the right way can boost their productivity by far more than 10x, he says.
But Karpathy also confirms Hotz's concerns about code quality: "When you actually look at the code, sometimes I get a little bit of a heart attack, because it's not like super amazing code necessarily all the time. It's very bloaty, there's a lot of copy paste, there's awkward abstractions that are brittle, and like, it works, but it's just really gross." Planning and understanding still need human expertise, according to Karpathy.
An OpenAI developer known by the pseudonym "roon" backed Hotz's concerns earlier this year but drew a somewhat unusual conclusion: AI will make mistakes, he said, some dramatic enough to take down entire systems. Those bugs will be difficult to find, but they'll still get fixed eventually. Developers will soon stop reviewing their code by hand, he said.