A Geometric Calculator Inside a Neural Network
Sheridan Feucht*,1,2 · 2026-05-21 · via Goodfire Research
We found a neural mechanism that operates over manifolds: a general-purpose addition module inside Llama 3.1 …