A Vision-Language Foundation Model to Enhance Efficiency of Chest X-ray Interpretation
About
Over 1.4 billion chest X-rays (CXRs) are performed annually due to their cost-effectiveness as an initial diagnostic test. This scale of radiological studies provides a significant opportunity to streamline CXR interpretation and documentation. While foundation models are a promising solution, the lack of publicly available large-scale datasets and benchmarks inhibits their iterative development and real-world evaluation. To overcome these challenges, we constructed a large-scale dataset (CheXinstruct), which we used to train a vision-language foundation model (CheXagent). We systematically demonstrated competitive performance across eight distinct task types on our novel evaluation benchmark (CheXbench). Beyond technical validation, we assessed the real-world utility of CheXagent in directly drafting radiology reports. Our clinical assessment with eight radiologists revealed a 36% time saving for residents using CheXagent-drafted reports, while attending radiologists showed no significant difference in time spent editing resident-drafted versus CheXagent-drafted reports. The CheXagent-drafted reports improved the writing efficiency of radiology residents and attending radiologists in 81% and 61% of cases, respectively, without loss of quality. Overall, we demonstrate that CheXagent can effectively perform a variety of CXR interpretation tasks and holds potential to assist radiologists in routine clinical workflows.
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Radiology Report Generation | MIMIC-CXR (test) | BLEU-4: 0.047 | 121 |
| Image Classification | COVIDx | Accuracy: 34.3 | 57 |
| Visual Question Answering | Chest X-ray VQA (test) | Overall Accuracy: 47.41 | 43 |
| Medical Report Generation | MIMIC-CXR | F1 Score: 31.95 | 22 |
| Chest X-ray Report Generation | MIMIC-CXR (test) | F1 Macro (14): 38.9 | 21 |
| Medical Image Report Labeling | MIMIC-CXR (test) | Macro F1 (14 Labels): 38.9 | 21 |
| Radiology Report Generation | RadVLM MIMIC-CXR (test) | ROUGE-L: 22.5 | 13 |
| Medical Report Generation | IU X-Ray | Precision: 50.37 | 11 |
| Abnormality Detection | CXR | IoU: 31 | 8 |
| Close-Ended Visual Question Answering | CXR | BERTScore: 90 | 8 |
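Several of the benchmarks above report a macro-averaged F1 over 14 finding labels (the standard CheXpert-style label set for MIMIC-CXR). As a minimal sketch of how such a score is computed, the snippet below implements macro F1 for multi-label predictions in plain Python; the toy inputs (and the reduced 3-label example) are hypothetical and only illustrate the metric, not CheXagent's evaluation pipeline.

```python
def per_label_f1(tp, fp, fn):
    """F1 for one label; defined as 0 when the label never occurs."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

def macro_f1(y_true, y_pred, num_labels=14):
    """Macro F1: average per-label F1, weighting every label equally.

    y_true, y_pred: lists of 0/1 vectors, one vector per study,
    with one entry per finding label.
    """
    scores = []
    for label in range(num_labels):
        tp = sum(t[label] and p[label] for t, p in zip(y_true, y_pred))
        fp = sum((not t[label]) and p[label] for t, p in zip(y_true, y_pred))
        fn = sum(t[label] and (not p[label]) for t, p in zip(y_true, y_pred))
        scores.append(per_label_f1(tp, fp, fn))
    return sum(scores) / num_labels

# Toy example with 3 labels for brevity (hypothetical data)
y_true = [[1, 0, 1], [0, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 0]]
print(round(macro_f1(y_true, y_pred, num_labels=3), 3))  # → 0.667
```

Because each label contributes equally regardless of prevalence, macro F1 rewards models that handle rare findings as well as common ones, which is why it is a common headline metric for CXR label extraction.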