Med-CoReasoner: Reducing Language Disparities in Medical Reasoning via Language-Informed Co-Reasoning
Fan Gao, She · 2026-05-27 · via cs.CL updates on arXiv.org

Abstract: While reasoning-enhanced large language models perform strongly on English medical tasks, a persistent multilingual gap remains, with substantially weaker reasoning in local languages, limiting equitable global medical deployment. To bridge this gap, we introduce Med-CoReasoner, a language-informed co-reasoning framework that elicits parallel English and local-language reasoning, abstracts them into structured concepts, and integrates local clinical knowledge into an English logical scaffold via concept-level alignment and retrieval. This design combines the structural robustness of English reasoning with the practice-grounded expertise encoded in local languages. To evaluate multilingual medical reasoning beyond multiple-choice settings, we construct MultiMed-X, a benchmark covering seven languages with expert-annotated long-form question answering and natural language inference tasks, comprising 350 instances per language. Experiments across three benchmarks show that Med-CoReasoner improves multilingual reasoning performance by an average of 5%, with particularly substantial gains in low-resource languages. Moreover, model distillation and expert evaluation analysis further confirm that Med-CoReasoner produces clinically sound and culturally grounded reasoning traces.
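The abstract gives only a high-level description of the co-reasoning step, so the following is a toy sketch of the data flow rather than the authors' implementation. It assumes each reasoning trace has already been reduced to (concept, detail) pairs, and it uses exact concept-label matching as a stand-in for concept-level alignment. The names `Step`, `MergedStep`, and `co_reason` are hypothetical, and the retrieval component is left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Step:
    """One reasoning step reduced to a concept label plus free-text detail."""
    concept: str
    detail: str


@dataclass
class MergedStep:
    """An English scaffold step with any aligned local-language details."""
    concept: str
    english_detail: str
    local_details: list[str] = field(default_factory=list)


def co_reason(english: list[Step], local: list[Step]) -> list[MergedStep]:
    """Keep the English step order as the scaffold and attach local details.

    Assumption: alignment is an exact match on the concept label. Local
    steps whose concept never appears in the English trace are ignored.
    """
    by_concept: dict[str, list[str]] = {}
    for step in local:
        by_concept.setdefault(step.concept, []).append(step.detail)
    return [
        MergedStep(s.concept, s.detail, by_concept.get(s.concept, []))
        for s in english
    ]


if __name__ == "__main__":
    # Placeholder strings only; no clinical content is implied.
    english_trace = [Step("diagnosis", "en-1"), Step("treatment", "en-2")]
    local_trace = [Step("treatment", "loc-1"), Step("follow_up", "loc-2")]
    for merged in co_reason(english_trace, local_trace):
        print(merged.concept, merged.english_detail, merged.local_details)
```

The sketch keeps the English trace as the ordering backbone, which mirrors the "English logical scaffold" wording in the abstract. How the framework actually abstracts concepts, scores alignment, or handles local-only concepts is not stated in the abstract and is not modeled here.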
Subjects: Computation and Language (cs.CL)
Cite as: arXiv:2601.08267 [cs.CL]
  (or arXiv:2601.08267v3 [cs.CL] for this version)
  https://doi.org/10.48550/arXiv.2601.08267

arXiv-issued DOI via DataCite

Submission history

From: Fan Gao
[v1] Tue, 13 Jan 2026 06:51:40 UTC (2,062 KB)
[v2] Mon, 19 Jan 2026 09:20:16 UTC (2,062 KB)
[v3] Tue, 26 May 2026 03:21:12 UTC (1,567 KB)