
WSI-LLaVA: A Multimodal Large Language Model for Whole Slide Image

About

Recent advancements in computational pathology have produced patch-level Multi-modal Large Language Models (MLLMs), but these models are limited by their inability to analyze whole slide images (WSIs) comprehensively and their tendency to bypass crucial morphological features that pathologists rely on for diagnosis. To address these challenges, we first introduce WSI-Bench, a large-scale morphology-aware benchmark containing 180k VQA pairs from 9,850 WSIs across 30 cancer types, designed to evaluate MLLMs' understanding of morphological characteristics crucial for accurate diagnosis. Building upon this benchmark, we present WSI-LLaVA, a novel framework for gigapixel WSI understanding that employs a three-stage training approach: WSI-text alignment, feature space alignment, and task-specific instruction tuning. To better assess model performance in pathological contexts, we develop two specialized WSI metrics: WSI-Precision and WSI-Relevance. Experimental results demonstrate that WSI-LLaVA outperforms existing models across all capability dimensions, with a significant improvement in morphological analysis, establishing a clear correlation between morphological understanding and diagnostic accuracy.

Yuci Liang, Xinheng Lyu, Wenting Chen, Meidan Ding, Jipeng Zhang, Xiangjian He, Song Wu, Xiaohan Xing, Sen Yang, Xiyue Wang, Linlin Shen • 2024

Related benchmarks

Task | Dataset | Result | Rank
Visual Question Answering | SlideBench-VQA TCGA | Microscopy Score: 56.08 | 32
Visual Question Answering | SlideBench-VQA BCNB | Overall: 55.3 | 25
Visual Question Answering | WSI-VQA | Overall Accuracy: 42.05 | 25
Visual Question Answering | PathMMU Tiny 1.0 (test) | Overall Accuracy: 44.83 | 24
Visual Question Answering | PathMMU 1.0 (ALL test) | Overall Score: 43.82 | 22
Whole-slide image visual-question answering | CPTAC | Accuracy: 72.1 | 14
Whole-slide image visual-question answering | SlideBench TCGA | Accuracy: 60.2 | 14
Open-ended Pathology Analysis | PathReasoner (test) | BLEU: 0.1 | 14
Whole Slide Image Analysis | WSI-Bench (test) | Morphological Analysis Open WSI P: 48.8 | 10
Multi-scale Analysis | HepatoPathoBench | WSI Score: 65 | 7
Showing 10 of 12 rows
