
CPath-Omni: A Unified Multimodal Foundation Model for Patch and Whole Slide Image Analysis in Computational Pathology

About

The emergence of large multimodal models (LMMs) has brought significant advancements to pathology. Previous research has primarily focused on training patch-level and whole-slide image (WSI)-level models separately, which limits the integration of learned knowledge across patches and WSIs and results in redundant models. In this work, we introduce CPath-Omni, the first 15-billion-parameter LMM designed to unify patch-level and WSI-level image analysis, consolidating a variety of tasks at both levels, including classification, visual question answering, captioning, and visual referring prompting. Extensive experiments demonstrate that CPath-Omni achieves state-of-the-art (SOTA) performance across seven diverse tasks on 39 out of 42 datasets, outperforming or matching task-specific models trained for individual tasks. Additionally, we develop CPath-CLIP, a specialized pathology CLIP-based visual processor for CPath-Omni. For the first time, it integrates different vision models and uses a large language model as the text encoder to build a more powerful CLIP model, achieving SOTA performance on nine zero-shot and four few-shot datasets. Our findings highlight CPath-Omni's ability to unify diverse pathology tasks, demonstrating its potential to streamline and advance the field of foundation models in pathology.

Yuxuan Sun, Yixuan Si, Chenglu Zhu, Xuan Gong, Kai Zhang, Pingyi Chen, Ye Zhang, Zhongyi Shui, Tao Lin, Lin Yang • 2024

Related benchmarks

Task | Dataset | Result | Rank
Image Classification | PCAM | Top-1 Acc: 95.9 | 58
Visual Question Answering | SlideBench-VQA TCGA | Microscopy Score: 63.7 | 32
Visual Question Answering | PathMMU Tiny 1.0 (test) | Overall Accuracy: 72.4 | 24
Visual Question Answering | PathMMU 1.0 (ALL test) | Overall Score: 72.2 | 22
Classification | BACH | Accuracy: 72.3 | 19
WSI Classification | TCGA-RCC | -- | 18
Pathological Multimodal Understanding | PathMMU ALL (test) | PubMed Accuracy: 69.9 | 16
Gene Mutation Prediction | CPTAC | BRCA PIK3CA AUC: 0.6423 | 15
Pathological Multimodal Understanding | PathMMU Tiny (test) | PubMed Score: 74 | 15
Classification | SkinCancer | Accuracy: 74.2 | 14
Showing 10 of 46 rows
