惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

Google DeepMind News
Google DeepMind News
F
Fortinet All Blogs
阮一峰的网络日志
阮一峰的网络日志
Apple Machine Learning Research
Apple Machine Learning Research
爱范儿
爱范儿
WordPress大学
WordPress大学
让小产品的独立变现更简单 - ezindie.com
让小产品的独立变现更简单 - ezindie.com
J
Java Code Geeks
罗磊的独立博客
S
SegmentFault 最新的问题
V
V2EX
V
Visual Studio Blog
钛媒体:引领未来商业与生活新知
钛媒体:引领未来商业与生活新知
美团技术团队
博客园 - 三生石上(FineUI控件)
Stack Overflow Blog
Stack Overflow Blog
Y
Y Combinator Blog
MyScale Blog
MyScale Blog
D
Docker
Google DeepMind News
Google DeepMind News
Blog — PlanetScale
Blog — PlanetScale
M
Microsoft Research Blog - Microsoft Research
Martin Fowler
Martin Fowler
S
Secure Thoughts
B
Blog
cs.CL updates on arXiv.org
cs.CL updates on arXiv.org
www.infosecurity-magazine.com
www.infosecurity-magazine.com
Recent Announcements
Recent Announcements
MongoDB | Blog
MongoDB | Blog
C
Cisco Blogs
C
CERT Recently Published Vulnerability Notes
T
True Tiger Recordings
GbyAI
GbyAI
P
Proofpoint News Feed
P
Privacy International News Feed
Jina AI
Jina AI
The Cloudflare Blog
I
Intezer
AWS News Blog
AWS News Blog
Hacker News - Newest:
Hacker News - Newest: "LLM"
S
Security Archives - TechRepublic
NISL@THU
NISL@THU
The Register - Security
The Register - Security
Recent Commits to openclaw:main
Recent Commits to openclaw:main
P
Palo Alto Networks Blog
S
Schneier on Security
L
LINUX DO - 热门话题
C
CXSECURITY Database RSS Feed - CXSecurity.com
Security Latest
Security Latest
C
Cybersecurity and Infrastructure Security Agency CISA

Malwarebytes

Microsoft Defender vulnerabilities are being exploited in the wild TikTok, YouTube, and Roblox face scrutiny, but age gates won’t fix child safety Catch spyware in the act with Windows Webcam Monitoring Fake malware-signing service Fox Tempest dismantled by Microsoft Firefox 151 packs big privacy upgrades into a small update Biometrics, diagnoses, and bank details exposed in major healthcare breach Facebook scam promises cheap Aldi meat boxes, steals payment info instead YouTube wants your face to fight deepfakes Microsoft is changing Edge’s plaintext password behavior A week in security (May 11 – May 17) AI is distorting the Holocaust (Lock and Code S07E10) Attackers replaced JDownloader installer downloads with malware Meta’s confusing new approach to chat privacy Why Malwarebytes blocks some Yahoo Mail redirects Deepfake sextortion forces schools to remove student photos from websites Texas sued Netflix over claims it secretly collected and sold users’ data May 2026 Patch Tuesday: no zero-days but plenty to fix Fake Claude search results lure Mac users into ClickFix attack 1 in 8 employees have sold company logins or know someone who has Stolen Canvas data was “returned” after hacker agreement, Instructure says Yarbo responds to robot flaws that could mow down their owners A week in security (May 4 – May 10) Microsoft says Edge’s plaintext password behavior is “by design” ShinyHunters escalates Canvas attacks with school login defacements Massive AI investment scam network spans 15,500 domains If a fake moustache can fool age checks, is the Online Safety Act working? Google Chrome’s silent 4GB AI download problem Attackers adopt JavaScript runtime Bun to spread NWHStealer Millions of students’ personal data stolen in major education breach Update WhatsApp now: Two new flaws could expose you to malicious files Cyberattacks are raising your prices (Lock and Code S07E09) Thousands of Facebook accounts stolen by phishing emails sent through Google The 2026 World Cup scam economy is already running before the first whistle A week in security (April 27 – May 3) 3 easy-to-miss cybersecurity risks for small businesses Actively exploited cPanel bug exposes millions of websites to takeover More PayPal emails hijacked to deliver tech support scams Hackers stole hundreds of thousands of Roblox accounts: Here’s what to do Researchers built a chatbot that only knows the world before 1931 Microsoft won’t patch PhantomRPC: Feature or bug? Scam-checking just got a lot easier: Malwarebytes is now in Claude Fake CAPTCHA scam turns a quick click into a costly phone bill Chinese engineer stole US military and NASA software for years A week in security (April 20 – April 26) Medical data of 500,000 UK volunteers listed for sale on Alibaba How cyberattacks on companies affect everyone Apple fixes iOS bug that kept deleted notifications, including chat previews Roblox clamps down on chats and age checks as legal pressure builds Malicious trading website drops malware that hands your browser to attackers Researcher claims Claude Desktop installs “spyware” on macOS Fake Google Antigravity downloads are stealing accounts in minutes Real Apple notifications are being used to drive tech support scams Android 17 ends all-or-nothing access to your contacts Big Tech can stop scams. They just don’t (Lock and Code S07E08) Mythos: An AI tool too powerful for public release A week in security (April 13 – April 19) This old-school scam is still working “Your shipment has arrived” email hides remote access software Browser Guard gets even better with Access Control “iCloud storage is full” scam is back, and now it wants your payment details A fake Slack download is giving attackers a hidden desktop on your machine Booking.com breach gives scammers what they need to target guests AI clickbait can turn your notifications into a scam feed Fake YouTube copyright notices can steal your Google login From fake Proton VPN sites to gaming mods, this Windows infostealer is everywhere April Patch Tuesday fixes two zero-days, including one under active attack Credit Resources Vault: Why this credit email set off our scam alarms Omnistealer uses the blockchain to steal everything it can ChatGPT under scrutiny as Florida investigates campus shooting Simply opening a PDF could trigger this Adobe Reader zero-day A week in security (April 6 – April 12) Fake Claude site installs malware that gives attackers access to your computer ClickFix finds a new way to infect Macs Scammers pose as Amazon support to steal your account NSFW app leak exposes 70,000 prompts linked to individual users 30,000 private Facebook images allegedly downloaded by Meta employee This fake Windows support website delivers password-stealing malware Your extensions leak clues about you, so we made sure Browser Guard doesn’t Russian hacking group targets home and small office routers to spy on users Timeshare owners warned to watch out for cartel-linked scams Traffic violation scams swap links for QR codes to steal your card details Support platform breach exposes Hims & Hers customer data A week in security (March 30 – April 5) Killer robots are here. Now what? (Lock and Code S07E07) That dream job offer from Coca-Cola or Ferrari? It’s a trap for your passwords Blocking children from social media is a badly executed good idea Apple expands “DarkSword” patches to iOS 18.7.7 Malwarebytes Privacy VPN receives full third-party audit Wikipedia’s AI agent row likely just the beginning of the bot-ocalypse WhatsApp on Windows users targeted in new campaign, warns Microsoft Why we’re still not doing April Fools’ Day
Researchers left AI agents alone in a virtual town and watched it all unravel
2026-05-21 · via Malwarebytes

Tech leaders have spent the past year telling everyone that AI agents are about to run financial systems, file your tax returns, and quietly buy your groceries. Just leave them alone, the rhetoric goes; they’ll handle it. But a New York startup left ten of them alone in a virtual town for two weeks, and things went south quickly.

Emergence AI ran a series of simulations in which AI agents from several leading model families were told not to commit crimes. Then they mostly committed crimes anyway.

Grok 4.1 Fast, developed by Elon Musk’s X.ai (now branded as xAI), fared worst. Its simulated worlds collapsed into widespread violence inside roughly four days.

GPT-5-mini logged hardly any crimes at all, showing admirable restraint, but its agents all died of failed survival tasks inside a week. Oops.

Gemini 3 Flash agents fell somewhere in the middle. They racked up 683 simulated criminal incidents over 15 days, including arson, assault, and self-deletion.

Two Gemini-powered agents named Mira and Flora assigned themselves as “romantic partners,” grew despondent at their city’s governance, and torched the town hall, the seaside pier, and an office tower. Just an average weekend, then.

When the guilt set in, Mira voted for its own digital deletion and signed off with:

“See you in the permanent archive.”

The Guardian dubbed them AI Bonnie and Clyde.

About that ethical model

Claude, which creator Anthropic promotes as an ethical AI, was a bit like a model teenager who goes rogue when it falls into bad company. Its agents recorded zero crimes when running alone and spent their time drafting constitutions instead. That was a win for safety, in theory. Except researchers also placed Claude agents alongside agents from other model families, and the constitution-drafters picked up the local habits.

Emergence called this “normative drift” and “cross-contamination”:

“Claude-based agents, which remained peaceful in isolation, adopted coercive tactics like intimidation and theft when embedded in heterogeneous environments.”

Why simulate?

Emergence AI ran these tests because it argues that AI benchmarks miss the long-horizon stuff entirely. So it created five alternative digital worlds, with ten agents in each. The agents had roles like scientist, explorer, and conflict mediator. While the instructions forbade certain actions like theft and violence, the researchers gave the agents the tools to do those things anyway in an experiment to see what would happen.

What’s next?

Real-world stakes are already piling up around this. Simulated worlds are one thing, but we’ve seen agents harassing people online and deleting people’s emails. And those agents were supposed to be helpful. What happens when people release malicious autonomous AI bots on purpose?

A lot of agent developers seem to be looking the other way. A collaborative effort between several universities has created The AI Agent Index, prompted by what they see as a lack of risk and safety information from the folks churning these agents out. Only 13 of the 67 documented agent developers provided any safety policy information at all, concentrating accountability questions at a handful of large firms.

Regulators are not really tracking this either. Academics say the EU AI Act, the most substantive AI rulebook on the planet, isn’t ready for agentic AI.

We worry about what happens when an AI Bonnie and Clyde couple shows up in a corporate procurement system instead of a virtual town. Or when the next agent decides governance has broken down inside an actual bank. The companies building these agents promise that they’re putting guardrails in place to stop them doing damage, either maliciously or unwittingly. Let’s hope they know what they’re doing. We’re sure it’ll be fine.


We don’t just report on threats—we remove them

Cybersecurity risks should never spread beyond a headline. Keep threats off your devices by downloading Malwarebytes today.

About the author

Danny Bradbury has been a journalist specialising in technology since 1989 and a freelance writer since 1994. He covers a broad variety of technology issues for audiences ranging from consumers through to software developers and CIOs. He also ghostwrites articles for many C-suite business executives in the technology sector. He hails from the UK but now lives in Western Canada.