
Futurism

Anthropic Was So Concerned About Its New Mythos-Based Model’s Power That It Lobotomized Its Ability to Improve Itself
Victor Tangermann · 2026-06-12 · via Futurism

A stylized photo illustration featuring Anthropic co-founder Dario Amodei.

Illustration by Tag Hartman-Simkins / Futurism. Source: Michael M. Santiago / Getty Images; Shutterstock


Earlier this year, Anthropic refused to release its Mythos AI model to the public, saying it was simply too dangerous.

At the time, executives claimed the model was capable of punching through powerful cybersecurity safeguards, pointing to researchers who used it to discover thousands of vulnerabilities in widely used open source code.

Months later, Anthropic was finally ready to go public with the model. On Tuesday, the Dario Amodei-led company announced a Mythos-powered model called Fable 5, which it claims is “safe for general use.”

However, new safeguards quickly frustrated AI researchers, who accused the company of intentionally lobotomizing Fable 5. The backlash was so fierce that Anthropic quickly adjusted the policy, as Wired reported on Wednesday, highlighting just how carefully the company is treading.

In its original announcement, Anthropic claimed the safeguards were designed to stop Fable 5 from improving itself, describing them as “new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development.” Just days ahead of the launch, Anthropic released a report on “when AI builds itself,” a trend that “might increase the risks of humans losing control over AI systems.”

However, AI researchers were not impressed by Anthropic hamstringing its latest model’s abilities.

“Anthropic’s latest model will NOT help you if it thinks your ML research/ML engineering is interesting, and/or will secretly degrade its IQ so that the average engineer won’t notice,” AI research firm SemiAnalysis tweeted.

“We are already seeing Anthropic’s latest model’s moderation filters our GPU inference research and programming,” it added.

Other researchers accused Anthropic of using Fable 5 to “shadowban” AI researchers, or quietly restrict their accounts. According to the firm’s system card, interventions limiting requests for “frontier LLM development” will “not be visible to the user.”

This last concern, which could’ve effectively sabotaged anybody trying to train competing models by quietly bumping them down to less powerful models without their knowledge, proved controversial enough for Anthropic to change its mind.

“We’re changing Fable 5’s safeguards for frontier LLM development to make them visible,” the company told Wired in a statement. “We made the wrong trade-off and we apologize for not getting the balance right.”

“It felt like Anthropic was saying to the public, ‘We don’t trust anybody else to do AI research,’” Will Brown, research lead at AI startup Prime Intellect, told the publication. “We are the only ones who have to do AI research.”

It all comes in the context of Anthropic calling for a global freeze on AI advances while discussing the dangers of “recursive self-improvement.” In other words, the company is making a lot of noise about a sci-fi-sounding possibility: that AI will start to rapidly improve itself, potentially escaping the control of its human creators.

Beyond limiting its ability to develop AI tools, Fable 5’s new safeguards also trigger when it encounters requests “related to cybersecurity, biology and chemistry, or distillation.” Distillation is effectively using machine learning to train a “student” model on the behavior and reasoning of a “teacher” model, a practice that has sparked its fair share of controversy.

Anthropic has already publicly griped about large-scale attempts to distill, or “extract,” its underlying model, a hypocritical stance given its indiscriminate scraping of rights-protected content on the web to train its AI in the first place.

More on Anthropic: Anthropic Scared, Calls for Global Freeze on AI Advances