BiomedCoOp: Learning to Prompt for Biomedical Vision-Language Models

About

Recent advancements in vision-language models (VLMs), such as CLIP, have demonstrated substantial success in self-supervised representation learning for vision tasks. However, effectively adapting VLMs to downstream applications remains challenging, as their accuracy often depends on time-intensive and expertise-demanding prompt engineering, while full model fine-tuning is costly. This is particularly true for biomedical images, which, unlike natural images, typically suffer from limited annotated datasets, unintuitive image contrasts, and nuanced visual features. Recent prompt learning techniques, such as Context Optimization (CoOp) intend to tackle these issues, but still fall short in generalizability. Meanwhile, explorations in prompt learning for biomedical image analysis are still highly limited. In this work, we propose BiomedCoOp, a novel prompt learning framework that enables efficient adaptation of BiomedCLIP for accurate and highly generalizable few-shot biomedical image classification. Our approach achieves effective prompt context learning by leveraging semantic consistency with average prompt ensembles from Large Language Models (LLMs) and knowledge distillation with a statistics-based prompt selection strategy. We conducted comprehensive validation of our proposed framework on 11 medical datasets across 9 modalities and 10 organs against existing state-of-the-art methods, demonstrating significant improvements in both accuracy and generalizability. The code is publicly available at https://github.com/HealthX-Lab/BiomedCoOp.

Taha Koleilat, Hojat Asgariandehkordi, Hassan Rivaz, Yiming Xiao• 2024

Related benchmarks

Task	Dataset	Result
Image Classification	OrganSMNIST	Accuracy28.92	152
Medical Image Classification	BUSI	--	126
Classification	PneumoniaMNIST	Accuracy70.51	94
Medical Image Classification	COVID	Accuracy78.72	91
Image Classification	BreastMNIST	Accuracy58.76	74
Biomedical Image Classification	11 Biomedical Datasets Average (test)	Accuracy (Avg)72.42	73
Medical Image Classification	DermaMNIST	Accuracy30.02	63
Medical Image Classification	PathMNIST	Accuracy72.59	61
Image Classification	BUSI	Accuracy70.34	58
Image Classification	Kvasir	Mean Accuracy78.89	51

Showing 10 of 62 rows

Other info

Code

Follow for update

@wizwand_team Discord