
Out-of-Context Reasoning (OOCR) in LLMs: A Short Primer and Reading List
Owain_Evans · 2026-05-23 · via LessWrong
Out-of-context reasoning (OOCR) is a concept relevant to LLM generalization and AI alignment. Also available…