Generative causal testing to bridge data-driven models and scientific theories in language neuroscience
Richard Antonello, Chandan Singh, Shailee Jain, Aliyah Hsu, Siha · 2024-10-01 · via cs.CL updates on arXiv.org

Representations from large language models are highly effective at predicting BOLD fMRI responses to language stimuli. However, these representations are largely opaque: it is unclear what features of the language stimulus drive the response in each brain area. We present generative causal testing (GCT), a framework for generating concise explanations of language selectivity in the brain from predictive models and then testing those explanations in follow-up experiments using LLM-generated stimuli. This approach is successful at explaining selectivity both in individual voxels and in cortical regions of interest (ROIs), including newly identified microROIs in prefrontal cortex. We show that explanatory accuracy is closely related to the predictive power and stability of the underlying predictive models. Finally, we show that GCT can dissect fine-grained differences between brain areas with similar functional selectivity. These results demonstrate that LLMs can be used to bridge the widening gap between data-driven models and formal scientific theories.