
Knowledge-enhanced Visual-Language Pre-training on Chest Radiology Images

About

While multi-modal foundation models pre-trained on large-scale data have been successful in natural language understanding and visual recognition, their use in medical domains remains limited due to the fine-grained nature of medical tasks and the high demand for domain knowledge. To address this challenge, we propose a novel approach called Knowledge-enhanced Auto Diagnosis (KAD), which leverages existing medical domain knowledge to guide vision-language pre-training on paired chest X-rays and radiology reports. We evaluate KAD on four external X-ray datasets and demonstrate that its zero-shot performance is not only comparable to that of fully supervised models, but also superior to the average of three expert radiologists for three (out of five) pathologies, with statistical significance. Moreover, when few-shot annotation is available, KAD outperforms all existing approaches in fine-tuning settings, demonstrating its potential for application in different clinical scenarios.
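The zero-shot evaluation described above can be sketched in a CLIP-style form: the model scores each pathology by comparing an image embedding against text embeddings of disease prompts in a shared space. The encoders below are random stand-ins for illustration only, not the actual KAD networks, and all function names are assumptions.

```python
import numpy as np

# Toy sketch of CLIP-style zero-shot diagnosis. Real systems use trained
# image/text encoders; here both are stand-ins so the example is runnable.
rng = np.random.default_rng(0)
EMBED_DIM = 128

def encode_image(pixels: np.ndarray) -> np.ndarray:
    """Stand-in image encoder: flatten and randomly project, then L2-normalize."""
    w = rng.standard_normal((pixels.size, EMBED_DIM))
    z = pixels.flatten() @ w
    return z / np.linalg.norm(z)

def encode_text(prompt: str) -> np.ndarray:
    """Stand-in text encoder: hash characters into the shared space."""
    z = np.zeros(EMBED_DIM)
    for i, ch in enumerate(prompt.encode()):
        z[(i * 31 + ch) % EMBED_DIM] += 1.0
    return z / np.linalg.norm(z)

def zero_shot_scores(image: np.ndarray, pathologies: list[str]) -> dict[str, float]:
    """Cosine similarity between the image and a prompt for each pathology."""
    img_z = encode_image(image)
    return {p: float(encode_text(f"there is {p}") @ img_z) for p in pathologies}

scores = zero_shot_scores(rng.standard_normal((32, 32)),
                          ["pneumothorax", "edema", "atelectasis"])
```

Each score can then be thresholded (or calibrated) per pathology to produce a diagnosis, which is how zero-shot classification metrics such as AUROC are computed without any task-specific training.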

Xiaoman Zhang, Chaoyi Wu, Ya Zhang, Yanfeng Wang, Weidi Xie • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Object Detection | RSNA | mAP (%) | 18.1 | 106 |
| Multi-Label Classification | ChestX-Ray14 (test) | AUROC (%) | 82.5 | 88 |
| Image Classification | CXR14 | AUC | 0.789 | 76 |
| Image Classification | RSNA (test) | AUC | 66.75 | 59 |
| Medical Semantic Segmentation | SIIM Pneumothorax | Dice Score | 45.17 | 46 |
| Image Classification | SIIM (test) | -- | -- | 30 |
| Image Classification | CheXpert 5x200 (test) | Accuracy | 23.5 | 19 |
| Medical Image Classification | MIDRC-XR Portable | AUC | 93.41 | 18 |
| Medical Image Classification | MIDRC-XR | AUC | 85.74 | 18 |
| Multi-label CXR Classification | Open-i (test) | AUC | 0.807 | 8 |
Showing 10 of 17 benchmark results.
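For reference on the segmentation row above, the Dice score is the standard overlap metric for binary masks, Dice = 2|P∩T| / (|P| + |T|). A minimal implementation:

```python
import numpy as np

def dice_score(pred: np.ndarray, target: np.ndarray) -> float:
    """Dice coefficient between two binary segmentation masks."""
    pred = pred.astype(bool)
    target = target.astype(bool)
    denom = pred.sum() + target.sum()
    if denom == 0:
        return 1.0  # common convention: two empty masks overlap perfectly
    return 2.0 * np.logical_and(pred, target).sum() / denom

a = np.array([[1, 1], [0, 0]])
b = np.array([[1, 0], [0, 0]])
# intersection = 1, |a| + |b| = 2 + 1, so Dice = 2/3
```

Benchmark tables usually report this value averaged over the test set (here as a percentage), so 45.17 corresponds to a mean Dice of about 0.45.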
