EmoMind: Decoding Affective Captions from Human Brain fMRI
Bilal A. Mohammed, Lin Gu, Ruogo Fang · 2026-05-16 · via cs.CL updates on arXiv.org

Decoding visual experience from brain activity has advanced substantially, but current brain-to-text systems largely recover semantic content while discarding affect. Additionally, language models can generate emotional text when prompted with categorical labels, but such labels collapse rich inter-subject variability into coarse discrete bins. We present EmoMind, the first end-to-end pipeline for decoding affective captions directly from fMRI signals. EmoMind first retrieves a semantically grounded neutral scene description from brain-decoded visual features, then rewrites it using a continuous 34-dimensional emotion vector decoded from the same fMRI recording. To control the balance between content preservation and affective expression, we train the rewriter with classifier-free guidance against an identity-preserving null branch, enabling smooth interpolation between semantic fidelity and affective expressivity. We evaluate affective caption generation with a three-axis validation framework spanning subject-specificity, structural geometry, and causal control. We further augment this framework with a synthetic-brain substitution test that probes robustness to the measurement apparatus, and we benchmark each axis against GPT-4 prompted with brain-decoded top-5 emotion labels as a strong discrete baseline. Across two independent emotion fMRI datasets, EmoMind significantly outperforms label-prompted GPT-4 on all three axes, with the largest gains on metrics that require person-specific affective structure rather than population-level emotion aggregation. These results establish continuous brain-decoded affect as a viable control signal for individualized affective caption generation and open new directions for studying individual affective brain organisation.
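The interpolation between semantic fidelity and affective expressivity can be illustrated with the standard classifier-free guidance combination rule. The abstract does not give EmoMind's exact formulation, so the following is only a minimal sketch under stated assumptions: the function name `cfg_combine`, the toy logit values, and the use of per-token logits are all hypothetical, and the null branch is assumed to reproduce the neutral caption while the conditional branch sees the emotion vector.

```python
from typing import List


def cfg_combine(null_logits: List[float],
                cond_logits: List[float],
                scale: float) -> List[float]:
    """Generic classifier-free guidance mix (not the paper's confirmed rule).

    null_logits: scores from the null branch (assumed identity-preserving,
                 i.e. it favours the neutral caption unchanged).
    cond_logits: scores from the branch conditioned on the emotion vector.
    scale:       guidance weight; 0.0 returns the null branch, 1.0 returns
                 the conditional branch, larger values extrapolate past it.
    """
    if len(null_logits) != len(cond_logits):
        raise ValueError("both branches must score the same vocabulary")
    return [n + scale * (c - n) for n, c in zip(null_logits, cond_logits)]


# Toy scores over a three-token vocabulary (hypothetical values).
null_logits = [1.0, 0.0, -1.0]
cond_logits = [0.0, 2.0, -1.0]

content_preserving = cfg_combine(null_logits, cond_logits, 0.0)
halfway = cfg_combine(null_logits, cond_logits, 0.5)
fully_affective = cfg_combine(null_logits, cond_logits, 1.0)
```

Under this assumption, sweeping `scale` from 0 upward moves the output from the neutral description toward a more emotionally coloured rewrite, which is the kind of smooth control the abstract describes.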