Multi-component Causal Tracing in Large Language Models
[Submitted on 2 Jun 2026] · 2026-06-03 · via cs.CL updates on arXiv.org


Abstract: Causal tracing systematically intervenes on the internal representations of a large language model (LLM) to uncover and quantify the causal pathways that link specific inputs or computations to metrics of interest that characterize the model's behavior. Building on previous single-component or single-layer studies, this paper presents a unified framework for causally tracing multiple components simultaneously. The framework systematically identifies the subsets of components (e.g., attention heads and multi-layer perceptron neurons) most critical to a target performance metric (e.g., accuracy or fairness), using flexible interventions that apply to a wide range of metrics. To address the combinatorial complexity of the multi-component problem, an efficient algorithm is designed that leverages soft interventions and a carefully designed metric transformation. It converts the combinatorial search into a continuous problem that can be solved efficiently under proper constraints and then yields binary decisions for selecting components. Experimental results demonstrate that the proposed method efficiently identifies subsets of the model's components that have a high impact on the target metric, outperforming existing baseline approaches. Our code is available at this https URL.
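
The idea of relaxing a combinatorial component search into a continuous one can be illustrated with a small sketch. This is not the paper's algorithm: the abstract does not specify its interventions, constraints, or metric transformation, so everything below is an assumption made for illustration. The hypothetical `toy_metric` stands in for an LLM metric, each component gets a soft ablation strength in (0, 1) through a sigmoid, a quadratic penalty stands in for the budget constraint, and the final binary decision is a simple top-k threshold.

```python
import numpy as np


def soft_gate(logits):
    """Map unconstrained logits to soft ablation strengths in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-logits))


def toy_metric(gates, contributions):
    """Hypothetical stand-in for an LLM metric.

    gates[i] = 1 keeps component i intact, gates[i] = 0 ablates it,
    and values in between interpolate (a soft intervention).
    """
    return float(np.tanh(np.dot(gates, contributions)))


def select_components(contributions, budget, steps=500, lr=0.5, penalty=1.0):
    """Pick `budget` components whose ablation lowers the toy metric most.

    The binary choice is relaxed to ablation strengths a = sigmoid(logits),
    optimized by gradient descent on
        metric(1 - a) + penalty * (sum(a) - budget) ** 2,
    and then rounded back to a binary mask by keeping the top `budget`.
    """
    logits = np.zeros_like(contributions, dtype=float)
    for _ in range(steps):
        a = soft_gate(logits)
        s = np.dot(1.0 - a, contributions)
        # Gradient of tanh(s) with respect to each ablation strength a_i.
        d_metric = -(1.0 - np.tanh(s) ** 2) * contributions
        # Gradient of the soft budget penalty (identical for every a_i).
        d_penalty = 2.0 * penalty * (a.sum() - budget)
        # Chain rule through the sigmoid: da/dlogit = a * (1 - a).
        logits -= lr * (d_metric + d_penalty) * a * (1.0 - a)
    a = soft_gate(logits)
    mask = np.zeros(len(contributions), dtype=bool)
    mask[np.argsort(-a)[:budget]] = True  # binary decision per component
    return mask, a


# Six hypothetical components; the two largest contributions are at 1 and 3.
contributions = np.array([0.1, 2.0, -0.5, 1.5, 0.05, 0.3])
mask, strengths = select_components(contributions, budget=2)
print("selected components:", np.flatnonzero(mask).tolist())
```

In this toy setting the best pair is obvious by inspection, which makes the relaxation easy to check. With a real model the metric is not additive in the components, so the soft strengths would have to be optimized through forward and backward passes of the network itself.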

Submission history

From: Zirui Yan
[v1] Tue, 2 Jun 2026 03:15:35 UTC (1,027 KB)