Independent-Component-Based Encoding Models of Brain Activity During Story Comprehension
Kamya Hari · 2026-04-29 · via cs.CL updates on arXiv.org

Abstract: Encoding models provide a powerful framework for linking continuous stimulus features to neural activity; however, traditional voxelwise approaches are limited by measurement noise, inter-subject variability, and redundancy arising from spatially correlated voxels encoding overlapping neural signals. Here, we propose an independent component (IC)-based encoding framework that dissociates stimulus-driven and noise-driven signals in fMRI data. We decompose continuous fMRI data from naturalistic story listening into ICs using one subset of the data, and train encoding models on independent data to predict IC time series from large language model representations of linguistic input. Across subjects, a subset of ICs exhibited consistently high predictivity. These ICs were spatially and temporally consistent across subjects and included cognitive networks known to respond during story listening (auditory and language). Auditory component time series were strongly correlated with acoustic stimulus features, highlighting the interpretability of identified component time series. Components identified as noise or motion-related artifacts by ICA-AROMA showed uniformly poor predictive performance, confirming that highly predicted components reflect genuine stimulus-related neural signals rather than confounds. Overall, IC-based encoding models enable analyses at the level of functional networks, accommodating the variability in network locations across individuals and providing interpretable results that are easy to compare across subjects.
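
The encoding step described in the abstract can be illustrated with a small sketch on synthetic data. This is not the authors' pipeline: the abstract does not name the regression method, so ridge regression is an assumption here, the component time series are assumed to have been extracted already (for example by ICA on a separate data subset), and all names and sizes (`fit_ridge`, `columnwise_corr`, `alpha`, the array shapes) are hypothetical. The sketch fits a linear map from stimulus features to component time series on training data and scores each component by the correlation between predicted and observed held-out time series.

```python
import numpy as np


def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    n_features = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(n_features), X.T @ Y)


def columnwise_corr(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom


rng = np.random.default_rng(0)
n_train, n_test, n_features, n_components = 400, 200, 16, 6

# Hypothetical stand-ins (assumptions, not real data):
# X plays the role of language-model features per time point,
# Y plays the role of component time series that were already extracted.
X = rng.standard_normal((n_train + n_test, n_features))
true_weights = rng.standard_normal((n_features, n_components))
true_weights[:, 3:] = 0.0  # components 3..5 carry no stimulus-driven signal
noise = rng.standard_normal((n_train + n_test, n_components))
Y = X @ true_weights + noise

# Fit on one portion of the data, evaluate on an independent portion.
X_train, X_test = X[:n_train], X[n_train:]
Y_train, Y_test = Y[:n_train], Y[n_train:]

W = fit_ridge(X_train, Y_train, alpha=10.0)
predictivity = columnwise_corr(X_test @ W, Y_test)

# One held-out correlation per component.
print(np.round(predictivity, 2))
```

In this toy setup the first three components are driven by the features and the last three are pure noise, so the held-out correlations separate the two groups. That mirrors, in simplified form, the contrast the abstract draws between highly predicted components and artifact components; a real analysis would also need to handle hemodynamic delay and regularization selection, which are omitted here.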
Subjects: Computation and Language (cs.CL); Neurons and Cognition (q-bio.NC)
Cite as: arXiv:2604.24942 [cs.CL]
  (or arXiv:2604.24942v1 [cs.CL] for this version)
  https://doi.org/10.48550/arXiv.2604.24942

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Kamya Hari
[v1] Mon, 27 Apr 2026 19:30:46 UTC (19,126 KB)