Reproducing and Improving CheXNet: Deep Learning for Chest X-ray Disease Classification

About

Deep learning for radiologic image analysis is a rapidly growing field in biomedical research and is likely to become standard practice in modern medicine. Using the publicly available NIH ChestX-ray14 dataset, which contains chest X-ray images labeled for the presence or absence of 14 diseases, we reproduced the CheXNet algorithm and explored other algorithms that outperform CheXNet's baseline metrics. Model performance was evaluated primarily with the F1 score and AUC-ROC, both of which are critical metrics for imbalanced, multi-label classification tasks in medical imaging. The best model achieved an average AUC-ROC of 0.85 and an average F1 score of 0.39 across all 14 disease classes in the dataset.
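As an illustration of how averaged AUC-ROC and F1 can be computed for a multi-label task such as this one, here is a minimal pure-Python sketch. It is not the authors' evaluation code. The function names, the 0.5 decision threshold, the unweighted (macro) averaging, and the three-label toy data are all assumptions made for the example.

```python
from typing import Callable, List, Sequence


def auc_roc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC-ROC is undefined unless both classes are present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def f1_at_threshold(y_true: Sequence[int], y_score: Sequence[float],
                    threshold: float = 0.5) -> float:
    """F1 score after binarizing scores at `threshold` (assumed 0.5 here)."""
    pred = [1 if s >= threshold else 0 for s in y_score]
    tp = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_average(labels: List[List[int]], scores: List[List[float]],
                  metric: Callable[[Sequence[int], Sequence[float]], float]) -> float:
    """Apply `metric` to each label column and return the unweighted mean."""
    n_classes = len(labels[0])
    per_class = [
        metric([row[c] for row in labels], [row[c] for row in scores])
        for c in range(n_classes)
    ]
    return sum(per_class) / n_classes


# Hypothetical toy data: 4 images, 3 labels (the real dataset has 14 labels).
labels = [[1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 1]]
scores = [[0.9, 0.2, 0.1], [0.3, 0.8, 0.4], [0.6, 0.4, 0.2], [0.1, 0.3, 0.7]]

mean_auc = macro_average(labels, scores, auc_roc)
mean_f1 = macro_average(labels, scores, f1_at_threshold)
print(f"mean AUC-ROC = {mean_auc:.3f}, mean F1 = {mean_f1:.3f}")
```

In this toy data every label is ranked perfectly, yet one true positive scores below the threshold, so the mean F1 falls below the mean AUC-ROC. AUC-ROC depends only on how scores are ranked, while F1 also depends on the chosen threshold, which is one reason the two averages can differ on imbalanced data.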

Daniel J. Strick, Carlos Garcia, Anthony Huang, Thomas Gardos • 2025

Related benchmarks

Task	Dataset	Result	Rank
Thoracic Disease Classification	NIH ChestX-ray14 (test)	--	44
Multi-label Chest X-ray Classification	NIH ChestX-ray14 (test)	AUC 85.27	4

