Rethinking Patient Education as Multi-turn Multi-modal Interaction
Zonghai Yao · 2026-04-17 · via cs.CL updates on arXiv.org

Abstract: Most medical multimodal benchmarks focus on static tasks such as image question answering, report generation, and plain-language rewriting. Patient education is more demanding: systems must identify relevant evidence across images, show patients where to look, explain findings in accessible language, and handle confusion or distress. Yet most patient education work remains text-only, even though combined image-and-text explanations may better support understanding. We introduce MedImageEdu, a benchmark for multi-turn, evidence-grounded radiology patient education. Each case provides a radiology report with report text and case images. A DoctorAgent interacts with a PatientAgent, conditioned on a hidden profile that captures factors such as education level, health literacy, and personality. When a patient question would benefit from visual support, the DoctorAgent can issue drawing instructions grounded in the report, case images, and the current question to a benchmark-provided drawing tool. The tool returns image(s), after which the DoctorAgent produces a final multimodal response consisting of the image(s) and a grounded plain-language explanation. MedImageEdu contains 150 cases from three sources and evaluates both the consultation process and the final multimodal response along five dimensions: Consultation, Safety and Scope, Language Quality, Drawing Quality, and Image-Text Response Quality. Across representative open- and closed-source vision-language model agents, we find three consistent gaps: fluent language often outpaces faithful visual grounding, safety is the weakest dimension across disease categories, and emotionally tense interactions are harder than low education or low health literacy. MedImageEdu provides a controlled testbed for assessing whether multimodal agents can teach from evidence rather than merely answer from text.
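The interaction protocol described in the abstract can be pictured with a rough Python sketch. This is not the benchmark's code: only the five dimension names are taken from the abstract, while every class, function, and parameter name below (`Case`, `PatientProfile`, `consult`, `wants_drawing`, `draw_tool`, `explain`, `mean_scores`) is a hypothetical stand-in, and the agents and drawing tool are replaced by trivial stubs.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

# Dimension names are from the abstract; all other names here are hypothetical.
DIMENSIONS = (
    "Consultation",
    "Safety and Scope",
    "Language Quality",
    "Drawing Quality",
    "Image-Text Response Quality",
)


@dataclass
class Case:
    report_text: str
    image_ids: List[str]


@dataclass
class PatientProfile:
    # Hidden profile: passed only to the patient side, never to the doctor side.
    education_level: str
    health_literacy: str
    personality: str


@dataclass
class Response:
    text: str
    images: List[str] = field(default_factory=list)


def consult(
    case: Case,
    profile: PatientProfile,
    patient: Callable[[PatientProfile, int], Optional[str]],
    wants_drawing: Callable[[str], bool],
    draw_tool: Callable[[List[str], str], List[str]],
    explain: Callable[[str, str], str],
    max_turns: int = 8,
) -> List[Response]:
    """Run one multi-turn consultation and collect the doctor's replies."""
    replies: List[Response] = []
    for turn in range(max_turns):
        question = patient(profile, turn)
        if question is None:  # the patient has no further questions
            break
        images: List[str] = []
        if wants_drawing(question):
            # Drawing instruction built from the current question (assumed format).
            instruction = f"highlight the region relevant to: {question}"
            images = draw_tool(case.image_ids, instruction)
        replies.append(Response(explain(case.report_text, question), images))
    return replies


def mean_scores(per_case: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-case scores for each of the five dimensions."""
    return {d: sum(s[d] for s in per_case) / len(per_case) for d in DIMENSIONS}


# Stubbed demo: a scripted patient and toy doctor-side callables.
case = Case("Small nodule in the right upper lobe.", ["img0"])
profile = PatientProfile("high school", "low", "anxious")
scripted = ["Where is the nodule?", "Is it dangerous?"]


def patient(profile: PatientProfile, turn: int) -> Optional[str]:
    return scripted[turn] if turn < len(scripted) else None


replies = consult(
    case,
    profile,
    patient,
    wants_drawing=lambda q: q.lower().startswith("where"),
    draw_tool=lambda ids, instr: [f"{i}-annotated" for i in ids],
    explain=lambda report, q: f"Based on the report: {report}",
)
```

In this sketch the hidden profile reaches only the `patient` callable, mirroring the abstract's statement that the PatientAgent is conditioned on it. How the actual benchmark routes drawing requests or aggregates scores is not specified in the abstract, so the simple mean in `mean_scores` is an assumption.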
Comments: Equal contribution for the first two authors
Subjects: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2604.14656 [cs.AI]
  (or arXiv:2604.14656v1 [cs.AI] for this version)
  https://doi.org/10.48550/arXiv.2604.14656

arXiv-issued DOI via DataCite

Submission history

From: Zonghai Yao
[v1] Thu, 16 Apr 2026 06:06:50 UTC (9,495 KB)