Authors: Thomas A. Buckley, Riccardo Conci, Peter G. Brodeur, Jason Gusdorf, Sourik Beltrán, Bita Behrouzi, Byron Crowe, Jacob Dockterman, Muzzammil Muhammad, Sarah Ohnigian, Andrew Sanchez, James A. Diao, Aashna P. Shah, Daniel Restrepo, Eric S. Rosenberg, Andrew S. Lea, Emily Glanton, Kimberly LeBlanc, Undiagnosed Diseases Network, Marinka Zitnik, Scott H. Podolsky, Zahir Kanjee, Raja-Elie E. Abdulnour, Jacob M. Koshy, Adam Rodman, Arjun K. Manrai
Abstract: Differential diagnosis is an iterative process that integrates patient information with broader medical knowledge. Clinical case series such as the NEJM Clinicopathologic Conferences (CPCs), published continuously since 1923, feature expert physicians who demonstrate diagnostic reasoning to peers, and have been used for decades to evaluate AI. However, prior AI evaluations have largely focused on final diagnostic accuracy rather than nuanced clinical reasoning. Here, we introduce Dr. CaBot, an agentic AI system that emulates an expert diagnostician by generating written and narrated slide-based presentations from an initial case description alone. CaBot recently generated the first AI diagnosis published in the 100+ year history of the NEJM CPCs. In blinded evaluations, physicians misclassified the source of the differential (CaBot vs. physician-written) in 46/62 (74%) of trials and rated them favorably across quality dimensions. When tasked with solving cases for 72 patients with undiagnosed disease from the NIH Undiagnosed Diseases Network, CaBot identified the working diagnosis in 50/72 (69%) of cases from referral notes alone. To promote transparency and research, we also developed CPC-Bench, a physician-validated benchmark based on 7,102 CPCs and 47,648 questions across 10 tasks. We show that CaBot outperforms frontier models on CPC-Bench, and release both CaBot and CPC-Bench publicly to foster progress in clinical AI.
Subjects: Artificial Intelligence (cs.AI); Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2509.12194 [cs.AI] (or arXiv:2509.12194v2 [cs.AI] for this version)
DOI: https://doi.org/10.48550/arXiv.2509.12194 (arXiv-issued DOI via DataCite)
From: Arjun Manrai
[v1] Mon, 15 Sep 2025 17:54:51 UTC (4,093 KB)
[v2] Sun, 24 May 2026 19:16:03 UTC (3,654 KB)