惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

F
Full Disclosure
WordPress大学
WordPress大学
小众软件
小众软件
Cloudbric
Cloudbric
AWS News Blog
AWS News Blog
腾讯CDC
量子位
人人都是产品经理
人人都是产品经理
大猫的无限游戏
大猫的无限游戏
freeCodeCamp Programming Tutorials: Python, JavaScript, Git & More
V
Vulnerabilities – Threatpost
Scott Helme
Scott Helme
Hugging Face - Blog
Hugging Face - Blog
博客园_首页
C
CXSECURITY Database RSS Feed - CXSecurity.com
The Hacker News
The Hacker News
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
IT之家
IT之家
Jina AI
Jina AI
Attack and Defense Labs
Attack and Defense Labs
S
SegmentFault 最新的问题
Simon Willison's Weblog
Simon Willison's Weblog
The Cloudflare Blog
阮一峰的网络日志
阮一峰的网络日志
T
Tailwind CSS Blog
Last Week in AI
Last Week in AI
博客园 - 【当耐特】
Google Online Security Blog
Google Online Security Blog
美团技术团队
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
V
Visual Studio Blog
罗磊的独立博客
L
LINUX DO - 最新话题
博客园 - Franky
博客园 - 叶小钗
Apple Machine Learning Research
Apple Machine Learning Research
The Last Watchdog
The Last Watchdog
J
Java Code Geeks
AI
AI
C
Cisco Blogs
酷 壳 – CoolShell
酷 壳 – CoolShell
C
Cyber Attacks, Cyber Crime and Cyber Security
Cisco Talos Blog
Cisco Talos Blog
博客园 - 三生石上(FineUI控件)
雷峰网
雷峰网
Help Net Security
Help Net Security
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
云风的 BLOG
云风的 BLOG
I
Intezer
S
Securelist

Pierce Freeman

A browser for agents | Pierce Freeman The grey market of podcast appearances The way I travel | Pierce Freeman Fixing slow AWS uploads | Pierce Freeman Local tools should still use vaults Starting a podcast in 2025 Being late but still being early Automating our home video imports Adding my parents to tailscale A deep dive on agent sandboxes Language servers for AI | Pierce Freeman My simple home podcast studio We need centralized infrastructure | Pierce Freeman Coercing agents to follow conventions using AST validation My unified theory of social selling My personal backup strategy | Pierce Freeman July updates to the homelab How the KV Cache works httpx is the right way to do web requests in Python Reputation is becoming everything | Pierce Freeman Building a (kind of) invisible mac app Updated knowledge in language models Making an ascii animation | Pierce Freeman How speculative decoding works | Pierce Freeman Under the hood of Claude Code Doing things because they're easy, not hard Speeding up sideeffects with JIT in mountaineer Firehot for hot reloading in Python Misadventures in Python hot reloading How text diffusion works | Pierce Freeman The tenacity of modern LLMs The ergonomics of rails | Pierce Freeman How language servers work | Pierce Freeman Just add eggs | Pierce Freeman Unfortunately SEO still matters | Pierce Freeman The futility of human-only web requirements Setting up Input Leap | Pierce Freeman Checking in on Waymo | Pierce Freeman The react revolution | Pierce Freeman Speeding up many small transfers to a unifi nas Quick notes on swift libraries AI engineering is a different animal San Francisco | Pierce Freeman Debugging a mountaineer rendering segfault Local network config on macOS Building our home network | Pierce Freeman Introducing Envelope.dev Legacy code and AI copilots Typehinting from day-zero | Pierce Freeman Generating database migrations with acyclic graphs Lofoten | Pierce Freeman Mountaineer v0.1: Webapps in Python and React Constraining LLM Outputs | Pierce Freeman Passthrough above all | Pierce Freeman Accuracy in kudos | Pierce Freeman How quick we are to adapt The curious case of LM repetition Costa Rica | Pierce Freeman Debugging chrome extensions with system-level logging Speeding up runpod | Pierce Freeman Inline footnotes with html templates Parsing Common Crawl in a day for $60 An era of rich CLI All or nothing with remote work The Next 10 Years | Pierce Freeman Adding wheels to flash-attention | Pierce Freeman LLMs as interdisciplinary agents | Pierce Freeman New Zealand | Pierce Freeman Representations in autoregressive models | Pierce Freeman Let's talk about Siri | Pierce Freeman Minimum viable public infrastructure | Pierce Freeman Reasoning vs. Memorization in LLMs Automatically migrate enums in alembic Greater sequence lengths will set us free On learning to ski | Pierce Freeman Dolomites | Pierce Freeman Using grpc with node and typescript Opportunity years | Pierce Freeman Buzzword peaks and valleys | Pierce Freeman Buenos Aires | Pierce Freeman Network routing interaction on MacOS Independent work: November recap Debugging slow pytorch training performance The provenance of copy and paste Debugging tips for neural network training Patagonia | Pierce Freeman Santiago | Pierce Freeman My 2022 digital travel kit AWS vs GCP - GPU Availability V2 Independent work: October recap | Pierce Freeman Planning Patagonia Relationship modeling | Pierce Freeman The power of status updates A new chapter | Pierce Freeman Give my library a coffee shop AWS vs GCP - GPU Availability V1 Switzerland | Pierce Freeman Headfull browsers beat headless | Pierce Freeman Webcrawling tradeoffs | Pierce Freeman Copenhagen | Pierce Freeman
We solved scratch content first
2026-02-02 · via Pierce Freeman

One of my greatest surprises with scaling laws is we solved from-scratch image generation sooner than we solved editing.

We could have started with trying to encourage models to learn Photoshop; perhaps not by UI interaction but certainly by programmatic manipulation. Take a base layer, stack on transformations that are modeled as tool calls, and measure the output against the results that we want. In some ways this feels easier than pixel based generation. You're guaranteed that shapes are logically coherent if they were created with object primitives that themselves are logical (square, circle, etc).

Instead we decided to solve the whole enchilada.

Models still aren't particularly good at interacting with editing tools; I've tried to build systems that will iteratively modify images with tools.1 At best it ends up with a few simple svg shapes but no sophistication. The results are so painfully worse - longer and worse quality - than just using Nano Banana Pro to generate the whole thing.

It ended up being way easier to treat the problem as truly end to end. But fully generating pixels comes at the expense of being able to tweak the end product manually. I suspect that's part of why we see such a flood of corporate slop2. Their creators might see the flaws - might even not be happy with them - but they've satisfied the bar of good enough. After all if you really want to get 100% good you're going to have to restart from scratch and build it yourself. It's much easier to accept the 95% quality that's one shotted.

At the moment we're left with a strange inversion: infinite creative power, zero creative control. You can make anything, as long as you're willing to accept whatever comes out. I imagine eventually the tooling will catch up: but I think it's distinctly possible we end up with better pixel-editing tools (ie. infinite guideable revisions) before we end up with tool use control.

  1. Simon Willison is well known for his pelican test of each LLM, where he tries to get the LLM to generate him a pelican from scratch via svg code. They're pretty bad at doing this; but a 7B image model can easily generate something photorealistic. ↩

  2. There's always the concern that we only can identify poor AI, not good examples. There might very well be a ton more AI generated content online than we recognize because it's good enough to pass unnoticed. ↩