
RadGraph: Extracting Clinical Entities and Relations from Radiology Reports

About

Extracting structured clinical information from free-text radiology reports can enable the use of radiology report information for a variety of critical healthcare applications. In our work, we present RadGraph, a dataset of entities and relations in full-text chest X-ray radiology reports based on a novel information extraction schema we designed to structure radiology reports. We release a development dataset, which contains board-certified radiologist annotations for 500 radiology reports from the MIMIC-CXR dataset (14,579 entities and 10,889 relations), and a test dataset, which contains two independent sets of board-certified radiologist annotations for 100 radiology reports split equally across the MIMIC-CXR and CheXpert datasets. Using these datasets, we train and test a deep learning model, RadGraph Benchmark, that achieves a micro F1 of 0.82 and 0.73 on relation extraction on the MIMIC-CXR and CheXpert test sets respectively. Additionally, we release an inference dataset, which contains annotations automatically generated by RadGraph Benchmark across 220,763 MIMIC-CXR reports (around 6 million entities and 4 million relations) and 500 CheXpert reports (13,783 entities and 9,908 relations) with mappings to associated chest radiographs. Our freely available dataset can facilitate a wide range of research in medical natural language processing, as well as computer vision and multi-modal learning when linked to chest radiographs.

Saahil Jain, Ashwin Agrawal, Adriel Saporta, Steven QH Truong, Du Nguyen Duong, Tan Bui, Pierre Chambon, Yuhao Zhang, Matthew P. Lungren, Andrew Y. Ng, Curtis P. Langlotz, Pranav Rajpurkar • 2021
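The abstract reports micro F1 for relation extraction. Micro-averaged F1 pools true positives, false positives, and false negatives across all reports before computing precision and recall, so frequent relation types dominate the score. The sketch below is illustrative only; the function name and the relation-tuple format are assumptions, not the RadGraph evaluation code:

```python
def micro_f1(gold_sets, pred_sets):
    """Micro-averaged F1 over per-report gold/predicted relation sets.

    Each element of gold_sets/pred_sets is a set of hashable relation
    tuples for one report (hypothetical format, e.g. (head, tail, label)).
    Counts are pooled across all reports before computing P, R, and F1.
    """
    tp = fp = fn = 0
    for gold, pred in zip(gold_sets, pred_sets):
        tp += len(gold & pred)   # relations predicted and present in gold
        fp += len(pred - gold)   # predicted but not in gold
        fn += len(gold - pred)   # in gold but missed
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0.0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, a perfect prediction over every report yields 1.0, while each spurious predicted relation lowers pooled precision without affecting recall.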

Related benchmarks

Task                                                                     | Dataset                                               | Metric      | Result | Rank
Correlation with radiologist-derived clinically significant error counts | ReXVal BERTScore-optimized candidate reports (n = 50) | Kendall's τ | 0.54   | 12
Correlation with radiologist-derived clinically significant error counts | ReXVal RadGraph-optimized candidate reports (n = 50)  | Kendall's τ | 0.59   | 12
Correlation with radiologist-derived clinically significant error counts | ReXVal BLEU-optimized candidate reports (n = 50)      | Kendall's τ | 0.64   | 12
Correlation with radiologist-derived clinically significant error counts | ReXVal CheXbert-optimized candidate reports (n = 50)  | Kendall's τ | 0.41   | 12
Metric Sensitivity Analysis                                              | Quilt-1M Control                                      | Score       | 31     | 5
Metric Sensitivity Analysis                                              | Quilt-1M Visual Hallucination                         | Performance Score | 0.19 | 5
Metric Sensitivity Analysis                                              | Quilt-1M Logic Error                                  | Score       | 25     | 5
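The Kendall's τ results above measure rank agreement between a metric's scores and radiologist error counts: τ counts concordant minus discordant pairs over all pairs. A minimal pure-Python sketch of τ-a (the helper name and the example score lists are illustrative, not drawn from the ReXVal data):

```python
from itertools import combinations

def kendall_tau(x, y):
    """Kendall's tau-a between two equal-length score lists.

    tau = (concordant pairs - discordant pairs) / total pairs,
    where a pair (i, j) is concordant when x and y rank it the
    same way. Ties count toward neither (tau-a convention).
    """
    assert len(x) == len(y) and len(x) > 1
    concordant = discordant = 0
    for i, j in combinations(range(len(x)), 2):
        s = (x[i] - x[j]) * (y[i] - y[j])
        if s > 0:
            concordant += 1
        elif s < 0:
            discordant += 1
    return (concordant - discordant) / (len(x) * (len(x) - 1) / 2)
```

Identical rankings give τ = 1.0 and fully reversed rankings give τ = -1.0, so a τ of 0.59 indicates that most, but not all, pairs of candidate reports are ordered the same way by the metric and by the radiologists' error counts.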
