Cognitive Security as an AI Safety Cause Area — LessWrong
jsteinhardt · 2026-05-26 · via LessWrong