A Geometric Calculator Inside a Neural Network

Sheridan Feucht · 2026-05-21 · via Goodfire Research
