
Knowledge Boosting: Rethinking Medical Contrastive Vision-Language Pre-Training

About

Foundation models built on pre-training have moved artificial intelligence from theory to practical application and have made computer-aided diagnosis feasible for widespread use. Medical contrastive vision-language pre-training, which requires no human annotations, is an effective approach for guiding representation learning with the descriptive information in diagnostic reports. However, its effectiveness is limited by large-scale semantic overlap and semantic shift in the medical domain. To address these issues, we propose the Knowledge-Boosting Contrastive Vision-Language Pre-training framework (KoBo), which integrates clinical knowledge into the learning of vision-language semantic consistency. The framework uses an unbiased, open-set, sample-wise knowledge representation to measure negative-sample noise and to supplement the correspondence between vision-language mutual information and clinical knowledge. Extensive experiments validate the framework on eight tasks spanning classification, segmentation, retrieval, and semantic relatedness, achieving comparable or better performance under zero-shot and few-shot settings. Our code is available at https://github.com/ChenXiaoFei-CS/KoBo.
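To make the knowledge-boosting idea concrete, here is a minimal sketch, assuming a standard InfoNCE-style image-text objective, of how a knowledge-derived similarity matrix can soften the penalty on noisy negatives. This illustrates the general technique, not the authors' KoBo implementation; the function name, the `knowledge_sim` input, and the temperature value are assumptions.

```python
# Illustrative sketch only -- NOT the authors' KoBo implementation.
# A knowledge-derived similarity matrix turns hard one-hot contrastive
# targets into soft targets, so "negatives" whose reports share clinical
# semantics are penalized less.
import torch
import torch.nn.functional as F

def knowledge_weighted_contrastive_loss(img_emb, txt_emb, knowledge_sim,
                                        temperature=0.07):
    """img_emb, txt_emb: (N, D) embeddings from the image/text encoders.
    knowledge_sim: assumed (N, N) matrix in [0, 1] scoring how similar two
    samples' clinical knowledge representations are (hypothetical input)."""
    img_emb = F.normalize(img_emb, dim=-1)
    txt_emb = F.normalize(txt_emb, dim=-1)
    logits = img_emb @ txt_emb.t() / temperature          # (N, N) pair scores

    # Soft targets: the matched pair plus knowledge-similar off-diagonal
    # pairs, renormalized to a probability distribution per row.
    eye = torch.eye(img_emb.size(0), device=logits.device)
    targets = eye + knowledge_sim * (1.0 - eye)
    targets = targets / targets.sum(dim=-1, keepdim=True)

    # Symmetric soft-label cross-entropy (PyTorch >= 1.10 accepts
    # probability targets in F.cross_entropy).
    loss_i2t = F.cross_entropy(logits, targets)
    loss_t2i = F.cross_entropy(logits.t(), targets.t())
    return 0.5 * (loss_i2t + loss_t2i)
```

With `knowledge_sim` set to all zeros this reduces to the standard symmetric InfoNCE loss; the off-diagonal target mass is what stops clinically overlapping reports from acting as hard negatives.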

Xiaofei Chen, Yuting He, Cheng Xue, Rongjun Ge, Shuo Li, Guanyu Yang • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|------|---------|--------|--------|------|
| Semantic Segmentation | SIIM | Dice Coefficient (%) | 65.54 | 96 |
| Classification | CheXpert | AUC | 0.866 | 25 |
| Language Semantic Relatedness | UMNSRS | Pearson Correlation | 0.2563 | 7 |
| Language Semantic Relatedness | MIMIC | Pearson Correlation | 0.4229 | 7 |
| Vision and Language Classification | CheXpert | AUROC | 0.8635 | 7 |
| Vision Classification | COVIDx | Accuracy (%) | 96.25 | 7 |
| Vision and Language Retrieval | MIMIC | mAP (%) | 84.67 | 7 |
| Vision Retrieval | CheXpert 5x200 | mAP (%) | 41.23 | 7 |
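For context on how zero-shot results like the classification rows above are typically produced from a contrastive vision-language model: each image is scored against text prompts for every label, with no task-specific training. The sketch below is a generic illustration under assumed encoder interfaces and prompt wording, not KoBo's evaluation code.

```python
# Generic zero-shot classification sketch for a contrastive vision-language
# model -- encoder interfaces and prompts are assumptions, not KoBo's API.
import torch
import torch.nn.functional as F

@torch.no_grad()
def zero_shot_probs(image_encoder, text_encoder, images, class_prompts):
    """images: (N, C, H, W) tensor; class_prompts: one string per class.
    Returns (N, num_classes) probabilities from image-prompt similarity."""
    img = F.normalize(image_encoder(images), dim=-1)        # (N, D)
    txt = F.normalize(text_encoder(class_prompts), dim=-1)  # (num_classes, D)
    return (img @ txt.t()).softmax(dim=-1)

# Usage idea for a CheXpert-style finding (hypothetical prompts):
# probs = zero_shot_probs(img_enc, txt_enc, batch,
#                         ["Findings consistent with pneumonia.",
#                          "No evidence of pneumonia."])
```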

Other info

Code: https://github.com/ChenXiaoFei-CS/KoBo
