CPLIP: Zero-Shot Learning for Histopathology with Comprehensive Vision-Language Alignment

About

This paper proposes Comprehensive Pathology Language Image Pre-training (CPLIP), a new unsupervised technique designed to enhance the alignment of images and text in histopathology for tasks such as classification and segmentation. This methodology enriches vision-language models by leveraging extensive data without needing ground truth annotations. CPLIP involves constructing a pathology-specific dictionary, generating textual descriptions for images using language models, and retrieving relevant images for each text snippet via a pre-trained model. The model is then fine-tuned using a many-to-many contrastive learning method to align complex interrelated concepts across both modalities. Evaluated across multiple histopathology tasks, CPLIP shows notable improvements in zero-shot learning scenarios, outperforming existing methods in both interpretability and robustness and setting a higher benchmark for the application of vision-language models in the field. To encourage further research and replication, the code for CPLIP is available on GitHub at https://cplip.github.io/
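The abstract does not spell out the many-to-many contrastive objective, so the following is only a minimal sketch of one plausible formulation, not the authors' implementation. It assumes each training example is a bag of image embeddings paired with a bag of text embeddings, scores a bag pair by the mean of all pairwise cosine similarities, and applies a symmetric InfoNCE-style cross-entropy over bags. The function names, the mean pooling, and the temperature value are all hypothetical choices made for illustration.

```python
import math

def normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def bag_similarity(image_bag, text_bag):
    # Assumed pooling: mean cosine similarity over every image/text pair.
    imgs = [normalize(v) for v in image_bag]
    txts = [normalize(v) for v in text_bag]
    total = sum(sum(a * b for a, b in zip(i, t)) for i in imgs for t in txts)
    return total / (len(imgs) * len(txts))

def cross_entropy_diag(logits):
    # Mean cross-entropy where the correct column for row k is column k.
    loss = 0.0
    for k, row in enumerate(logits):
        m = max(row)  # subtract the max for numerical stability
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        loss += log_z - row[k]
    return loss / len(logits)

def many_to_many_contrastive_loss(image_bags, text_bags, temperature=0.07):
    # Symmetric loss: image-bag -> text-bag and text-bag -> image-bag.
    n = len(image_bags)
    sims = [
        [bag_similarity(image_bags[i], text_bags[j]) / temperature for j in range(n)]
        for i in range(n)
    ]
    sims_t = [list(col) for col in zip(*sims)]
    return 0.5 * (cross_entropy_diag(sims) + cross_entropy_diag(sims_t))

# Toy 2-D embeddings: bag 0 points along x, bag 1 points along y.
image_bags = [[[1.0, 0.0], [0.9, 0.1]], [[0.0, 1.0], [0.1, 0.9]]]
text_bags = [[[1.0, 0.05]], [[0.05, 1.0], [0.0, 1.0]]]

aligned = many_to_many_contrastive_loss(image_bags, text_bags)
swapped = many_to_many_contrastive_loss(image_bags, text_bags[::-1])
```

With these toy inputs, the correctly paired bags yield a much smaller loss than the swapped pairing, which is the behaviour a contrastive alignment objective is meant to reward.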

Sajid Javed, Arif Mahmood, Iyyakutti Iyappan Ganapathi, Fayaz Ali Dharejo, Naoufel Werghi, Mohammed Bennamoun • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Semantic segmentation | DigestPath (test) | DSC | 68.7 | 29 |
| Tile-level classification | PatchCamelyon | F1 | 56.7 | 24 |
| Tile-level classification | BACH | Weighted F1 Score | 56.3 | 22 |
| Tile-level classification | WSSS4LUAD | Weighted F1 Score | 88.2 | 16 |
| Tile-level classification | NCT-CRC | Weighted F1 Score | 84.4 | 16 |
| Tile-level classification | SICAP | Weighted Avg F1 | 0.511 | 16 |
| Tile-level classification | DigestPath | Weighted F1 Score | 90.7 | 16 |
| WSI-level classification | CAMELYON-16 | F1 (Weighted) | 63.2 | 16 |
| Tile-level classification | Databiox | Weighted F1 Score | 0.487 | 16 |
| Tile-level classification | SkinCancer | Weighted F1 Score | 47.6 | 16 |
Showing 10 of 24 rows
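
For readers unfamiliar with the metrics listed in the benchmark rows, here is a minimal sketch of how a support-weighted F1 score and a Dice similarity coefficient (DSC) are commonly computed. It is an illustration of the standard definitions, not the evaluation code behind these results, and the labels and masks are made-up toy data. Both functions return fractions in [0, 1]; some rows above appear to report percentages instead.

```python
from collections import Counter

def weighted_f1(y_true, y_pred):
    # Per-class F1, averaged with weights proportional to each class's true count.
    support = Counter(y_true)
    total = 0.0
    for cls, n_true in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == cls and p == cls)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != cls and p == cls)
        fn = n_true - tp
        f1 = 2 * tp / (2 * tp + fp + fn)  # denominator > 0 because n_true > 0
        total += n_true * f1
    return total / len(y_true)

def dice(mask_true, mask_pred):
    # Dice coefficient for two flat binary masks of equal length.
    inter = sum(1 for t, p in zip(mask_true, mask_pred) if t and p)
    size = sum(1 for t in mask_true if t) + sum(1 for p in mask_pred if p)
    return 2 * inter / size if size else 1.0

# Toy tile labels (hypothetical).
y_true = ["tumor", "tumor", "tumor", "normal"]
y_pred = ["tumor", "tumor", "normal", "normal"]
score = weighted_f1(y_true, y_pred)

# Toy flattened segmentation masks (hypothetical).
overlap = dice([1, 1, 0, 0], [1, 0, 0, 0])
```

In the toy example, the "tumor" class has F1 = 0.8 and "normal" has F1 = 2/3, so the support-weighted average is (3 × 0.8 + 1 × 2/3) / 4.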
