惯性聚合 高效追踪和阅读你感兴趣的博客、新闻、科技资讯
阅读原文 在惯性聚合中打开

推荐订阅源

T
Tenable Blog
AI
AI
The Hacker News
The Hacker News
T
Threatpost
S
Securelist
V
Vulnerabilities – Threatpost
I
Intezer
C
CERT Recently Published Vulnerability Notes
P
Privacy & Cybersecurity Law Blog
N
News and Events Feed by Topic
P
Palo Alto Networks Blog
C
Cybersecurity and Infrastructure Security Agency CISA
A
Arctic Wolf
Latest news
Latest news
Cisco Talos Blog
Cisco Talos Blog
Security Archives - TechRepublic
Security Archives - TechRepublic
Project Zero
Project Zero
Google DeepMind News
Google DeepMind News
Schneier on Security
Schneier on Security
N
News | PayPal Newsroom
H
Hacker News: Front Page
cs.CV updates on arXiv.org
cs.CV updates on arXiv.org
C
CXSECURITY Database RSS Feed - CXSecurity.com
Threat Intelligence Blog | Flashpoint
Threat Intelligence Blog | Flashpoint
cs.AI updates on arXiv.org
cs.AI updates on arXiv.org
Google Online Security Blog
Google Online Security Blog
K
Kaspersky official blog
OSCHINA 社区最新新闻
OSCHINA 社区最新新闻
Engineering at Meta
Engineering at Meta
WordPress大学
WordPress大学
V2EX - 技术
V2EX - 技术
S
Security Affairs
博客园 - Franky
L
LangChain Blog
Recent Announcements
Recent Announcements
Cyber Security Advisories - MS-ISAC
Cyber Security Advisories - MS-ISAC
G
GRAHAM CLULEY
博客园 - 聂微东
T
Tor Project blog
S
Secure Thoughts
Application and Cybersecurity Blog
Application and Cybersecurity Blog
Martin Fowler
Martin Fowler
P
Proofpoint News Feed
Help Net Security
Help Net Security
Hacker News: Ask HN
Hacker News: Ask HN
酷 壳 – CoolShell
酷 壳 – CoolShell
L
LINUX DO - 最新话题
B
Blog RSS Feed
奇客Solidot–传递最新科技情报
奇客Solidot–传递最新科技情报
D
Darknet – Hacking Tools, Hacker News & Cyber Security

cs.LG updates on arXiv.org

Memory-Guided Trust-Region Bayesian Optimization (MG-TuRBO) for High Dimensions EngageTriBoost: Predictive Modeling of User Engagement in Digital Mental Health Intervention Using Explainable Machine Learning Reservoir observer enhanced with residual calibration and attention mechanism Efficient RL Training for LLMs with Experience Replay Wireless Communication Enhanced Value Decomposition for Multi-Agent Reinforcement Learning Adversarial Sensor Errors for Safe and Robust Wind Turbine Fleet Control IKKA: Inversion Classification via Critical Anomalies for Robust Visual Servoing Adaptive Simulation Experiment for LLM Policy Optimization EvoLen: Evolution-Guided Tokenization for DNA Language Model Smartwatch-Based Sitting Time Estimation in Real-World Office Settings Structural Evaluation Metrics for SVG Generation via Leave-One-Out Analysis Loom: A Scalable Analytical Neural Computer Architecture Spectral Geometry of LoRA Adapters Encodes Training Objective and Predicts Harmful Compliance Finite-Sample Analysis of Nonlinear Independent Component Analysis:Sample Complexity and Identifiability Bounds How does Chain of Thought decompose complex tasks? Uncertainty-Aware Transformers: Conformal Prediction for Language Models Adaptive Candidate Point Thompson Sampling for High-Dimensional Bayesian Optimization Using Synthetic Data for Machine Learning-based Childhood Vaccination Prediction in Narok, Kenya Delve into the Applicability of Advanced Optimizers for Multi-Task Learning Bridging SFT and RL: Dynamic Policy Optimization for Robust Reasoning Multi-Agent Decision-Focused Learning via Value-Aware Sequential Communication Predictive Entropy Links Calibration and Paraphrase Sensitivity in Medical Vision-Language Models Efficient Hierarchical Implicit Flow Q-learning for Offline Goal-conditioned Reinforcement Learning Modality-Aware Zero-Shot Pruning and Sparse Attention for Efficient Multimodal Edge Inference The nextAI Solution to the NeurIPS 2023 LLM Efficiency Challenge Feature-Label Modal Alignment for Robust Partial Multi-Label Learning Integrated electro-optic attention nonlinearities for transformers Toward World Models for Epidemiology Tracing the Chain: Deep Learning for Stepping-Stone Intrusion Detection Policy-Aware Design of Large-Scale Factorial Experiments Beyond Augmented-Action Surrogates for Multi-Expert Learning-to-Defer Continuous Orthogonal Mode Decomposition: Haptic Signal Prediction in Tactile Internet Batch Distillation Data for Developing Machine Learning Anomaly Detection Methods Predicting Metabolic Dysfunction-Associated Steatotic Liver Disease using Machine Learning Methods: A Retrospective Cohort Study Adaptive Tuning of Parameterized Traffic Controllers via Multi-Agent Reinforcement Learning Bandwidth-constrained Variational Message Encoding for Cooperative Multi-agent Reinforcement Learning Neural Two-Stage Stochastic Optimization for Solving Unit Commitment Problem SafeAdapt: Provably Safe Policy Updates in Deep Reinforcement Learning Rays as Pixels: Learning A Joint Distribution of Videos and Camera Trajectories PhysInOne: Visual Physics Learning and Reasoning in One Suite FIRE-CIR: Fine-grained Reasoning for Composed Fashion Image Retrieval Detecting Diffusion-generated Images via Dynamic Assembly Forests PDE-regularized Dynamics-informed Diffusion with Uncertainty-aware Filtering for Long-Horizon Dynamics Leave My Images Alone: Preventing Multi-Modal Large Language Models from Analyzing Images via Visual Prompt Injection Regime-Conditional Retrieval: Theory and a Transferable Router for Two-Hop QA Identification and Anonymization of Named Entities in Unstructured Information Sources for Use in Social Engineering Detection Hypergraph Neural Networks Accelerate MUS Enumeration ASTRA: Adaptive Semantic Tree Reasoning Architecture for Complex Table Question Answering Neighbourhood Transformer: Switchable Attention for Monophily-Aware Graph Learning WOMBET: World Model-Based Experience Transfer for Robust and Sample-efficient Reinforcement Learning Low-Data Supervised Adaptation Outperforms Prompting for Cloud Segmentation Under Domain Shift Revisiting the Capacity Gap in Chain-of-Thought Distillation from a Practical Perspective A Mathematical Framework for Temporal Modeling and Counterfactual Policy Simulation of Student Dropout Temporal Dropout Risk in Learning Analytics: A Harmonized Survival Benchmark Across Dynamic and Early-Window Representations MedFormer-UR: Uncertainty-Routed Transformer for Medical Image Classification Dictionary-Aligned Concept Control for Safeguarding Multimodal LLMs Hierarchical Kernel Transformer: Multi-Scale Attention with an Information-Theoretic Approximation Analysis Post-Hoc Guidance for Consistency Models by Joint Flow Distribution Learning SenBen: Sensitive Scene Graphs for Explainable Content Moderation R2G: A Multi-View Circuit Graph Benchmark Suite from RTL to GDSII $p1$: Better Prompt Optimization with Fewer Prompts A Little Rank Goes a Long Way: Random Scaffolds with LoRA Adapters Are All You Need Deep Learning-Based Tracking and Lineage Reconstruction of Ligament Breakup Every Response Counts: Quantifying Uncertainty of LLM-based Multi-Agent Systems through Tensor Decomposition Unified Multimodal Uncertain Inference EfficientSign: An Attention-Enhanced Lightweight Architecture for Indian Sign Language Recognition Skip-Connected Policy Optimization for Implicit Advantage PRAGMA: Revolut Foundation Model 3D-VCD: Hallucination Mitigation in 3D-LLM Embodied Agents through Visual Contrastive Decoding Creator Incentives in Recommender Systems: A Cooperative Game-Theoretic Approach for Stable and Fair Collaboration in Multi-Agent Bandits Evidential Transformation Network: Turning Pretrained Models into Evidential Models for Post-hoc Uncertainty Estimation StructRL: Recovering Dynamic Programming Structure from Learning Dynamics in Distributional Reinforcement Learning Generative 3D Gaussian Splatting for Arbitrary-ResolutionAtmospheric Downscaling and Forecasting From Selection to Scheduling: Federated Geometry-Aware Correction Makes Exemplar Replay Work Better under Continual Dynamic Heterogeneity Needle in a Haystack: One-Class Representation Learning for Detecting Rare Malignant Cells in Computational Cytology Detection of Hate and Threat in Digital Forensics: A Case-Driven Multimodal Approach Semantic Intent Fragmentation: A Single-Shot Compositional Attack on Multi-Agent AI Pipelines Joint Interference Detection and Identification via Adversarial Multi-task Learning HaloProbe: Bayesian Detection and Mitigation of Object Hallucinations in Vision-Language Models R3PM-Net: Real-time, Robust, Real-world Point Matching Network From Dispersion to Attraction: Spectral Dynamics of Hallucination Across Whisper Model Scales AlphaLab: Autonomous Multi-Agent Research Across Optimization Domains with Frontier LLMs Act or Escalate? Evaluating Escalation Behavior in Automation with Language Models Multivariate Time Series Anomaly Detection via Dual-Branch Reconstruction and Autoregressive Flow-based Residual Density Estimation On the Spectral Geometry of Cross-Modal Representations: A Functional Map Diagnostic for Multimodal Alignment Structured Exploration and Exploitation of Label Functions for Automated Data Annotation MolPaQ: Modular Quantum-Classical Patch Learning for Interpretable Molecular Generation CausalVAD: De-confounding End-to-End Autonomous Driving via Causal Intervention Reinforcement-aware Knowledge Distillation for LLM Reasoning SubQuad: Near-Quadratic-Free Structure Inference with Distribution-Balanced Objectives in Adaptive Receptor framework A Horizon-Aware Decision-Support Framework for Demand Forecasting Model Selection in Resilient Production Planning Measurement-Consistent Langevin Corrector for Stabilizing Latent Diffusion Inverse Problem Solvers Multi-agent Adaptive Mechanism Design When & How to Write for Personalized Demand-aware Query Rewriting in Video Search Relational Visual Similarity From Navigation to Refinement: Revealing the Two-Stage Nature of Flow-based Diffusion Models through Oracle Velocity On-the-Fly Adaptation to Quantization: Configuration-Aware LoRA for Efficient Fine-Tuning of Quantized LLMs STCast: Adaptive Boundary Alignment for Global and Regional Weather Forecasting OmniPrism: Learning Disentangled Visual Concept for Image Generation FIT-GNN: Faster Inference Time for GNNs that 'FIT' in Memory Using Coarsening
RadSEM: A Finding-by-Finding Metric for Clinical Consistency in Radiology Reports
[Submitted on 3 Jun 2026] · 2026-06-17 · via cs.LG updates on arXiv.org

View PDF HTML (experimental)

Abstract:Radiology report evaluation must distinguish clinical compatibility from surface similarity, because negation, laterality, or normal-abnormal polarity can reverse a finding. We propose RadSEM (Radiology Sentence-Level Evaluation Metric), a constrained LLM-assisted metric for reference-based evaluation of radiology Findings. RadSEM rewrites reference and generated reports into ordered atomic finding sentences, each expressing one site-finding proposition. It then performs contradiction-constrained many-to-many matching: incompatible pairs such as "effusion" and "no effusion" receive no credit, while compatible granularity differences can receive partial credit. A deterministic stage weights pairs by part-whole and abnormal-detail relationships, counts unmatched findings, and produces an abnormal-focused weighted F1 score. Thus, the LLM supports structured rewriting and local alignment rather than acting as an opaque judge. We evaluate RadSEM with SSREE, a controlled monotonicity stress test built from 2,448 de-identified reports expanded into five graded corruption levels. RadSEM achieves Kendall tau_b of 0.957, all-pairs concordance of 97.8%, adjacent concordance of 95.0%, and strict five-level ordering for 81.9% of reports, outperforming radiology-specific and general text metrics while avoiding the failure in which polarity-inverted reports regain lexical overlap. On the same SSREE set, RadSEM outperforms the Ref-anchored RadSEM-Alt policy, improving adjacent concordance from 90.7% to 95.0% and strict ordering from 67.2% to 81.9%. On a 599-triplet synonym/antonym subset, RadSEM prefers synonyms in 597 cases (99.67%). These results suggest that explicit finding units, contradiction-aware matching, and abnormal-focused deterministic scoring make report scoring more interpretable and sensitive to clinically meaningful errors. Code is available at this https URL.

Submission history

From: Jun Zhao [view email]
[v1] Wed, 3 Jun 2026 13:59:42 UTC (4,675 KB)