Generalized Radiograph Representation Learning via Cross-supervision between Images and Free-text Radiology Reports
About
Pre-training lays the foundation for recent successes in deep-learning-based radiograph analysis: it learns transferable image representations through large-scale fully supervised or self-supervised learning on a source domain. However, supervised pre-training requires a complex and labor-intensive two-stage human-assisted annotation process, while self-supervised learning cannot yet compete with the supervised paradigm. To tackle these issues, we propose a cross-supervised methodology named REviewing FreE-text Reports for Supervision (REFERS), which acquires free supervision signals from the original radiology reports accompanying the radiographs. The approach employs a vision transformer and is designed to learn joint representations from the multiple views within each patient study. REFERS outperforms its transfer-learning and self-supervised-learning counterparts on four well-known X-ray datasets under extremely limited supervision, and it even surpasses methods pre-trained on a source domain of radiographs with human-assisted structured labels. REFERS thus has the potential to replace canonical pre-training methodologies.
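To make the cross-supervision idea concrete, below is a minimal, illustrative PyTorch sketch, not the authors' released implementation: a small vision-transformer stand-in encodes each view of a patient study, an attention layer fuses the per-view features into a single study representation, and a symmetric contrastive loss aligns that representation with a precomputed embedding of the free-text report. All names (`TinyViT`, `CrossSupervisedStudyEncoder`, `study_report_contrastive_loss`), the dimensions, and the use of a single contrastive objective in place of the paper's full set of report-related pre-training tasks are assumptions made for illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TinyViT(nn.Module):
    """Minimal vision-transformer stand-in: patch embedding, a [CLS]
    token, and a small transformer encoder (illustrative only)."""
    def __init__(self, img_size=224, patch=16, dim=256, depth=4, heads=4):
        super().__init__()
        n_patches = (img_size // patch) ** 2
        self.patch_embed = nn.Conv2d(1, dim, kernel_size=patch, stride=patch)
        self.cls_token = nn.Parameter(torch.zeros(1, 1, dim))
        self.pos_embed = nn.Parameter(torch.zeros(1, n_patches + 1, dim))
        layer = nn.TransformerEncoderLayer(dim, heads, dim * 4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, depth)

    def forward(self, x):                       # x: (B, 1, H, W)
        tokens = self.patch_embed(x).flatten(2).transpose(1, 2)  # (B, N, dim)
        cls = self.cls_token.expand(x.size(0), -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos_embed
        return self.encoder(tokens)[:, 0]       # [CLS] feature, (B, dim)

class CrossSupervisedStudyEncoder(nn.Module):
    """Encodes every view in a patient study with a shared ViT, fuses
    the per-view features by attention, and projects the study
    representation into the report-embedding space."""
    def __init__(self, dim=256, proj_dim=128, txt_dim=768):
        super().__init__()
        self.vit = TinyViT(dim=dim)
        self.fuse = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.query = nn.Parameter(torch.zeros(1, 1, dim))
        self.img_proj = nn.Linear(dim, proj_dim)
        # txt_dim assumes BERT-sized pooled report features (assumption).
        self.txt_proj = nn.Linear(txt_dim, proj_dim)

    def forward(self, views, report_emb):
        # views: (B, V, 1, H, W); report_emb: (B, txt_dim) pooled report feature
        B, V = views.shape[:2]
        feats = self.vit(views.flatten(0, 1)).view(B, V, -1)  # (B, V, dim)
        q = self.query.expand(B, -1, -1)
        study, _ = self.fuse(q, feats, feats)                 # (B, 1, dim)
        z_img = F.normalize(self.img_proj(study.squeeze(1)), dim=-1)
        z_txt = F.normalize(self.txt_proj(report_emb), dim=-1)
        return z_img, z_txt

def study_report_contrastive_loss(z_img, z_txt, temperature=0.07):
    """Symmetric InfoNCE: each study should match its own report."""
    logits = z_img @ z_txt.t() / temperature
    targets = torch.arange(z_img.size(0), device=z_img.device)
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))

# Toy forward pass: 2 studies, 2 views each, with random "report" embeddings.
model = CrossSupervisedStudyEncoder()
views = torch.randn(2, 2, 1, 224, 224)
reports = torch.randn(2, 768)
loss = study_report_contrastive_loss(*model(views, reports))
```

Because the ViT weights are shared across views and fusion happens before the loss, the supervision signal from a single report reaches every radiograph in the study, which is what allows free-text reports to stand in for per-image structured labels.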
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image-omic Classification | TCGA Lung Cancer | Accuracy: 93.68 | 14 |
| Text-Image Retrieval | MIMIC-CXR 5x200 | mAP@1: 60.6 | 9 |
| Image Classification | MIMIC 5x200 (test) | Accuracy: 49.5 | 9 |
| Image-Text Retrieval | MIMIC-CXR 5x200 | mAP@1: 52.4 | 9 |
| Image Classification | CheXpert 5x200 (test) | Accuracy: 41.8 | 9 |