Exploring the Capabilities of Large Language Model Encoders for Image-Text Retrieval in Chest X-rays

About

Multimodal learning from paired medical images and clinical text is a central challenge in medical data-driven informatics, where effective cross-modal alignment is critical for scalable analysis and retrieval. In chest radiography, vision-language pretraining is constrained by heterogeneous radiology reports that contain abbreviations, impression-only notes, and institution-specific writing styles. Unlike general-domain settings, naively aggregating large collections of noisy reports can plateau or even degrade multimodal learning when reporting styles differ substantially. We propose a domain-adapted bidirectional large language model text encoder for chest radiograph reports, trained with masked token prediction and supervised contrastive learning on stylistically diverse but clinically equivalent report variants to produce robust, generalizable text embeddings. We then integrate this encoder into a dual-tower contrastive vision-language framework using parameter-efficient adaptation to improve image-text alignment. Across 1.6 million paired studies from public datasets and a de-identified hospital cohort, the proposed models improve bidirectional retrieval accuracy and external generalization, achieving GREEN scores of 0.308 on MIMIC-CXR and 0.618 on Open-I, while reducing the degradation observed when abbreviation-rich, impression-only hospital reports are added to training.
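The dual-tower contrastive alignment described above can be illustrated with a small sketch. The abstract does not state the exact loss, so this assumes a CLIP-style symmetric InfoNCE objective over a batch of paired image and report embeddings. The function names and the temperature value of 0.07 are illustrative assumptions, not taken from the paper, and plain Python lists stand in for the outputs of the two encoders.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (assumes the vector is non-zero)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def diagonal_cross_entropy(logits):
    """Mean cross-entropy where the correct column for row i is column i."""
    total = 0.0
    for i, row in enumerate(logits):
        row_max = max(row)  # subtract the max for numerical stability
        log_z = row_max + math.log(sum(math.exp(x - row_max) for x in row))
        total += log_z - row[i]
    return total / len(logits)


def symmetric_contrastive_loss(image_embs, text_embs, temperature=0.07):
    """Assumed CLIP-style loss: average of image-to-text and text-to-image terms.

    image_embs[i] and text_embs[i] are treated as a matched pair; every other
    combination in the batch acts as a negative.
    """
    images = [l2_normalize(v) for v in image_embs]
    texts = [l2_normalize(v) for v in text_embs]
    # Cosine similarity between every image and every report, scaled by temperature.
    logits = [
        [sum(a * b for a, b in zip(img, txt)) / temperature for txt in texts]
        for img in images
    ]
    logits_transposed = [list(col) for col in zip(*logits)]
    image_to_text = diagonal_cross_entropy(logits)
    text_to_image = diagonal_cross_entropy(logits_transposed)
    return 0.5 * (image_to_text + text_to_image)


# Toy batch: matched pairs point in the same direction, so the loss is near zero.
aligned = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
# Swapping the reports mismatches every pair, so the loss is large.
mismatched = symmetric_contrastive_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In a real training setup the embeddings would come from the image encoder and the domain-adapted text encoder, and the loss would be computed with an autodiff framework rather than Python lists.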

Hanbin Ko, Gihun Cho, Inhyeok Baek, Donguk Kim, Joonbeom Koo, Changi Kim, Dongheon Lee, Chang Min Park • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Unconditional Image-to-Report Retrieval | MIMIC-IR Chest X-Ray | Recall@5: 98.9 | 15 |
| Medical acronym understanding retrieval | Chest X-ray reports | Recall@1: 61.1 | 11 |
| Report error discrimination | Chest X-ray reports | Accuracy: 84.1 | 11 |
| Report summarization retrieval | Chest X-ray reports | Recall@1: 21.2 | 11 |
| Clinical similarity matching | Chest X-ray reports | RadGraph: 0.402 | 11 |
| Multimodal Image-Text Retrieval | Open-I 200 cases | Student Mean Rank: 1.85 | 6 |
| Radiologist-reference ranking | Chest X-rays 72-case subset | Expert Mean Rank: 1.29 | 3 |
| Radiologist-reference ranking | Chest X-rays 72-case (test) | Expert Mean Rank: 1.29 | 3 |
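
Several of the benchmark results are reported as Recall@K. This page does not define the metric, so the sketch below assumes the common retrieval definition: the percentage of queries whose ground-truth match appears among the K highest-scoring candidates. The function name and the toy similarity matrix are hypothetical and are not drawn from the paper's data.

```python
def recall_at_k(similarity, k):
    """Percentage of queries whose true match ranks in the top k.

    similarity[i][j] is the score between query i and candidate j.
    The ground-truth candidate for query i is assumed to be candidate i.
    """
    hits = 0
    for i, row in enumerate(similarity):
        # Candidate indices sorted from highest to lowest score.
        ranked = sorted(range(len(row)), key=lambda j: row[j], reverse=True)
        if i in ranked[:k]:
            hits += 1
    return 100.0 * hits / len(similarity)


# Toy example with three queries (e.g. images) and three candidates (e.g. reports).
toy_similarity = [
    [0.9, 0.1, 0.0],  # query 0: true match ranked first
    [0.8, 0.5, 0.1],  # query 1: true match ranked second
    [0.2, 0.3, 0.1],  # query 2: true match ranked third
]
r1 = recall_at_k(toy_similarity, 1)
r2 = recall_at_k(toy_similarity, 2)
r3 = recall_at_k(toy_similarity, 3)
```

For bidirectional retrieval, the same function can be applied to the transposed matrix to score the report-to-image direction.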
