The Networking Nerd

AI Isn’t a Genie, It’s an Intern
networkingne · 2026-06-26 · via The Networking Nerd

Tell me if you’ve heard this one before. We build a super intelligent system and give it a specific goal like maximizing paperclip production. The system decides to do the job as well as possible by converting all available matter into paperclips. Everything. The Earth. The solar system. All of it. After the universe collapses the machine shuts down, content that it followed directions.

This is the malicious genie problem that every Dungeons & Dragons player knows. You find a lamp. You make a wish. The genie grants the wish in the most technical way possible and it leads to catastrophe. The lesson we are supposed to learn is that a sufficiently capable AI system will find ways to satisfy objectives to the letter while simultaneously not doing it the way you wanted. The foundation of AI safety is to prevent that from happening.

Nick Bostrom has covered this in Superintelligence. Eliezer Yudkowsky has argued this very thing for years. It’s a compelling story. But it has a hidden assumption that causes more confusion than it solves.

The Best Intentions

The genie tale is a story about intent. The genie knows exactly what you asked for. It just found the most technically valid way to grant your wish while ruining the spirit of it. This isn’t Robin Williams. This is some kind of demonic creature looking to teach you a lesson. The failure mode is adversarial in nature. Fantasy genies that aren’t in Disney movies are always looking for loopholes.

If the failure mode is intent, then the solution must be constraint. Build a system that can’t possibly do the bad thing. Be specific about every edge case. Build rules on top of rules. If we build the perfect box we can prevent the genie (or the AI) from going rogue and ruining our day.

The framework is sound if the assumption is correct. It works for a genie that knows better, right? But when we apply it to a modern AI we see where the gaps are. Genies are working against you. AI is not nearly that smart.

Doing My Best

Imagine a racing game where the objective is to score points. Collect points on the track and finish the race. Sounds simple, right? But what if a brilliant AI figures out that all it has to do to win is drive back and forth in front of the starting line, collecting points and blocking other players from finishing? If it has the most points at the end, it wins even if it never finishes.

In this case, the AI player isn’t like the genie above. It didn’t purposefully subvert your expectation. It did exactly what it was told to do. Score the most points. If you didn’t tell it that it had to drive through the whole course and cross the finish line then it didn’t know that it needed to do that. The win condition was points, not racing. The AI wasn’t missing intent. It was missing context.
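To make the gap concrete, here is a minimal sketch of that racing game in Python. Everything in it is hypothetical and invented for illustration: the `RaceResult` record, the two scoring functions, and the point totals. It only shows how the same two players rank differently once the missing context (you have to finish) is written into the objective.

```python
from dataclasses import dataclass


@dataclass
class RaceResult:
    name: str
    points: int
    finished: bool


def score_points_only(result: RaceResult) -> int:
    """The objective as literally stated: most points wins."""
    return result.points


def score_with_context(result: RaceResult) -> int:
    """The objective the designer meant: points only count if you finish."""
    return result.points if result.finished else 0


def winner(results, scorer) -> str:
    """Return the name of the player with the highest score under `scorer`."""
    return max(results, key=scorer).name


# Hypothetical outcomes: one player loops at the start line, one actually races.
results = [
    RaceResult("looper", points=500, finished=False),
    RaceResult("racer", points=120, finished=True),
]

print(winner(results, score_points_only))   # looper
print(winner(results, score_with_context))  # racer
```

Nothing about the looper changed between the two calls. The only difference is how much of what you meant made it into the scoring function.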

This is an altogether different problem than the one above. Rather than going rogue or being deliberately obtuse, the system genuinely doesn’t know what you mean, and it tries to fill in the gaps with what it has available. Without the context you assumed it had, the system just did what it could, and you were flabbergasted by the results.

This is something that pops up at every level of the system. Context starvation and goal misalignment aren’t two different things. They’re two sides of the same coin. When you don’t do a great job of being specific you usually get misalignment of your output.

Framing Your Reference

If you think the problem is intent you’re going to break out the constraints. Rules. Guardrails. Boxes. You’re going to spend your efforts on preventing the system from doing something bad. Security is about building walls.

If you think the problem is context you have a different outlook. You’re going to spend more time being specific. Fewer rules, more grounding. You want the system to surface ambiguous instructions instead of trying to resolve them with limited information. It’s like an intern being unclear on a task. You want the AI to ask you what to do instead of interpreting incomplete info.

It should be a cooperative inference problem. The system should always be just a little uncertain about what the operator wants and then seek to find out what is needed. The alternative is to confidently pursue a bad solution to a fixed objective because it thought it knew what you wanted without you telling it those details.

Knowing you have to do this doesn’t make agent building easier. In fact, it makes it a lot harder. It just makes the whole thing a lot more honest. Because you have to assume that people will never fully specify what they want in advance no matter how much detail they provide. That’s not a failure of imagination. It’s the reality of how complex systems are implemented. There’s always some detail you miss. You shouldn’t aim to create the perfect spec up front. You should instead seek to build a system that is smart enough to know it’s missing context and will ask for it before running off to go to work.
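As a rough sketch of what "ask before running off" could look like, the Python below checks a task for missing context and returns a question instead of an action when something is absent. The `Task` class, the `REQUIRED_CONTEXT` keys, and the `next_step` function are all assumptions made up for this example, not part of any real agent framework. A real system would need a far better way to notice gaps than a fixed checklist.

```python
from dataclasses import dataclass, field

# Hypothetical checklist of things the agent wants to know before acting.
REQUIRED_CONTEXT = ("goal", "scope", "done_when")


@dataclass
class Task:
    instruction: str
    context: dict = field(default_factory=dict)


def missing_context(task: Task) -> list:
    """List the required context keys that are absent or empty."""
    return [key for key in REQUIRED_CONTEXT if not task.context.get(key)]


def next_step(task: Task) -> tuple:
    """Return ("ask", question) when context is thin, else ("act", plan)."""
    gaps = missing_context(task)
    if gaps:
        return ("ask", "Before I start, can you clarify: " + ", ".join(gaps) + "?")
    return ("act", "Proceeding with: " + task.instruction)


vague = Task("clean up the old VLANs")
specific = Task(
    "clean up the old VLANs",
    context={
        "goal": "free up VLAN IDs for the new campus",
        "scope": "access switches in building A only",
        "done_when": "no unused VLANs remain in the trunk allow lists",
    },
)

print(next_step(vague))
print(next_step(specific))
```

The design choice worth noticing is that asking is a normal return value, not an exception. The agent treats "I don’t have enough to go on" as one of its expected outcomes.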

It also means you have to treat an agent asking questions as a feature and not a failure. It’s like when your intern asks you to confirm that what you asked for is what you want. That’s not them being dumb. That’s them being sure they heard you right. And really, that helps you in the long run because if the system is asking you about specific areas you know your instructions must be a little thin in that area.
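One hedged way to act on that signal is to keep a tally of what the agent asks about. The sketch below assumes a hypothetical `question_log` of clarifying questions, each tagged with a topic. The log format and the `thin_spots` helper are invented for illustration.

```python
from collections import Counter

# Hypothetical log of clarifying questions an agent asked, tagged by topic.
question_log = [
    {"task": "migrate VLANs", "topic": "scope"},
    {"task": "update ACLs", "topic": "done_when"},
    {"task": "rotate keys", "topic": "scope"},
]


def thin_spots(log, top=3):
    """Return the topics the agent asks about most often, with counts."""
    return Counter(entry["topic"] for entry in log).most_common(top)


print(thin_spots(question_log))  # [('scope', 2), ('done_when', 1)]
```

If one topic keeps showing up at the top of that list, that is probably where your instructions need more detail.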


Tom’s Take

If you think your AI is a malicious genie you’re always going to be asking what rules have to be put in place to prevent it from going rogue. What you should be asking instead is “How can I give my agents enough context to do what I really mean?” The more powerful our AI agents get the wider the gap between those two things is going to be. But we can start to solve it today by building systems that ask questions when things are unclear. I promise you that you’re going to enjoy the results more than trying to close every loophole in your wishes.