Yi's Blog

Claude Code Complexity: Safety, Safety, Safety
Yi · 2025-06-27 · via Yi's Blog

I tried Claude Code this week and immediately felt how empowering the tool is. I was stunned by how naturally it blends into developer workflows.

It demonstrated how easily LLM makers can disrupt application makers (Cursor, in this case). This reminds me of the analogy Andrej Karpathy made in his Software Is Changing (Again) presentation: LLMs have strong analogies to operating systems. LLM makers can disrupt app makers just as Apple can sherlock third-party software running on macOS.

With Google releasing a similar tool, Gemini CLI, I began to wonder what the main complexity in Claude Code is, and whether that complexity is hard enough to sustain companies that rely on building agentic tools.

I found a video in which Boris Cherny, the creator of Claude Code, answered my first question:

Audience: I was wondering what was the hardest implementation, like part of the implementation for you of building it?

Boris: I think there’s a lot of tricky parts. I think one part that is tricky is the things that we do to make bash commands safe. Bash is inherently pretty dangerous and it can change system state in unexpected ways. But at the same time, if you have to manually approve every single bash command, it’s super annoying as an engineer.

Boris: … the thing we landed on is there’s commands that are read-only, there’s static analysis that we do in order to figure out which commands can be combined in safe ways, and then we have this pretty complex tiered permission system so that you can allow list and block list commands at different levels.

This highlights a key insight: In agentic systems, safety isn’t an afterthought—it’s the core challenge.

How do we know if a command is safe to run? How can these tools predict the consequences of an action? Currently, the burden is shifted to the developer via permission dialogs. But eventually, developers will expect these tools to act more autonomously—without compromising safety.
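Boris's answer names three pieces: a set of read-only commands, static analysis of how commands are combined, and tiered allow and block lists. Here is a minimal Python sketch of how those pieces might fit together. It is not Claude Code's implementation: the `READ_ONLY` set, the `Tier` class, the rule format, and the three verdicts (`allow`, `deny`, `ask`) are all assumptions made for illustration. Anything the sketch cannot reason about falls back to `ask`, which corresponds to today's permission dialog.

```python
import shlex
from dataclasses import dataclass, field

# Hypothetical policy data, for illustration only.
READ_ONLY = frozenset({"ls", "cat", "pwd", "head", "tail", "wc", "grep"})
SEPARATORS = frozenset({"&&", "||", ";", "|"})
OPERATOR_CHARS = frozenset("();<>|&")


@dataclass
class Tier:
    """One level of rules (e.g. user or project); rules are command prefixes."""
    name: str
    allow: list = field(default_factory=list)
    deny: list = field(default_factory=list)


def split_segments(command):
    """Split a command line on && || ; | and reject anything harder to analyze."""
    if "$" in command or "`" in command:
        raise ValueError("expansion or command substitution")
    lexer = shlex.shlex(command, posix=True, punctuation_chars=True)
    lexer.whitespace_split = True
    segments, current = [], []
    for token in lexer:
        if token in SEPARATORS:
            segments.append(current)
            current = []
        elif token and set(token) <= OPERATOR_CHARS:
            # Redirections, subshells, background jobs, and so on.
            raise ValueError(f"unsupported operator {token!r}")
        else:
            current.append(token)
    segments.append(current)
    if not all(segments):
        raise ValueError("empty segment")
    return segments


def matches(segment, rule):
    """True if the segment starts with the words of the rule."""
    parts = rule.split()
    return segment[:len(parts)] == parts


def classify_segment(segment, tiers):
    # A deny rule at any tier wins over everything else.
    if any(matches(segment, r) for t in tiers for r in t.deny):
        return "deny"
    if segment[0] in READ_ONLY:
        return "allow"
    if any(matches(segment, r) for t in tiers for r in t.allow):
        return "allow"
    return "ask"


def decide(command, tiers):
    """Return 'allow', 'deny', or 'ask' for a whole command line."""
    try:
        segments = split_segments(command)
    except ValueError:
        return "ask"  # could not analyze it, so leave it to the human
    verdicts = [classify_segment(s, tiers) for s in segments]
    if "deny" in verdicts:
        return "deny"
    return "ask" if "ask" in verdicts else "allow"


EXAMPLE_TIERS = [
    Tier("user", deny=["rm -rf"]),
    Tier("project", allow=["npm test"]),
]

if __name__ == "__main__":
    for cmd in ["cat a.txt | wc -l", "ls && npm test", "rm -rf build", "git push"]:
        print(cmd, "->", decide(cmd, EXAMPLE_TIERS))
```

With these example tiers, `ls && npm test` is allowed because every segment is either read-only or allow-listed, `rm -rf build` is denied, and `git push` or a redirection like `cat notes.txt > out.txt` falls back to asking. The sketch is deliberately conservative: a read-only program name says nothing about dangerous flags, which is part of why the real problem is hard.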

For commands that only affect local environments, Docker might offer a partial solution. But many real-world use cases involve remote effects—like modifying a task in Linear or changing a GitHub label. These remote side effects raise thorny questions about trust, auditability, and failure handling.
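For the local half of that problem, here is a rough Python sketch of what running an agent's command inside a throwaway Docker container could look like. The helper names, the `/workspace` mount point, and the `python:3.12-slim` image are assumptions for illustration. The flags themselves (`--rm`, `--network none`, `--read-only`, `--volume`, `--workdir`) are standard `docker run` options. It does nothing about remote side effects beyond cutting off the network entirely.

```python
import subprocess
from pathlib import Path


def sandboxed_argv(command, workdir, image="python:3.12-slim"):
    """Build a `docker run` argv that confines `command` to one directory."""
    workdir = Path(workdir).resolve()
    return [
        "docker", "run", "--rm",              # discard the container afterwards
        "--network", "none",                  # no network, so no remote effects
        "--read-only",                        # container filesystem is read-only
        "--volume", f"{workdir}:/workspace",  # only the project dir is writable
        "--workdir", "/workspace",
        image, "sh", "-c", command,
    ]


def run_sandboxed(command, workdir, timeout=60):
    """Run the command in the sandbox; requires Docker to be installed."""
    argv = sandboxed_argv(command, workdir)
    return subprocess.run(argv, capture_output=True, text=True, timeout=timeout)
```

The trade-off is visible in the flags: with `--network none` the command cannot touch Linear or GitHub at all, so the sandbox avoids remote effects by forbidding them and does not make them safe.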

After exploring Claude Code and Gemini CLI, I’m excited about where this space is headed. The next breakthroughs may come not just from smarter agents—but from safer ones.

– EOF –