BioVLM: Routing Prompts, Not Parameters, for Cross-Modality Generalization in Biomedical VLMs
About
Pretrained biomedical vision-language models (VLMs) such as BioMedCLIP perform well on average but often degrade on challenging modalities where inter-class margins are small and acquisition-specific variations are pronounced, especially under few-shot supervision and when modality priors differ substantially from those of the pretraining corpora. We propose BioVLM, a prompt-learning framework that improves cross-domain generalization without extensive backbone fine-tuning. BioVLM learns a diverse prompt bank and introduces dynamic prompt selection: for each input, it selects the most discriminative prompts via a low-entropy criterion on the predictive distribution, effectively coupling sparse few-shot evidence with rich LLM semantic priors. To strengthen this coupling, we distill high-confidence LLM-derived attributes and enforce robust knowledge transfer through strong/weak augmentation consistency. At test time, BioVLM adapts by choosing modality-appropriate prompts, enabling transfer to unseen categories and domains while keeping training lightweight and inference efficient. On 11 MedMNIST+ 2D datasets, BioVLM achieves a new state of the art across three distinct generalization settings. Code is available at https://github.com/mainaksingha01/BioVLM.
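The low-entropy selection criterion described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the top-k selection, and the averaging of the selected prompts' predictions are all assumptions made for illustration. The sketch assumes each prompt in the bank yields one row of class logits for a given image, computes the Shannon entropy of each row's softmax distribution, and keeps the prompts with the lowest entropy.

```python
import numpy as np

def softmax(logits, axis=-1):
    # Subtract the row maximum for numerical stability.
    z = logits - logits.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def select_low_entropy_prompts(logits, k=2):
    """Pick the k prompts whose predictive distribution has the lowest entropy.

    logits: array of shape (num_prompts, num_classes) for a single image.
    Returns (chosen prompt indices, per-prompt entropy, fused probabilities).
    Top-k selection and mean fusion are illustrative assumptions.
    """
    probs = softmax(logits, axis=-1)
    entropy = -(probs * np.log(probs + 1e-12)).sum(axis=-1)
    chosen = np.argsort(entropy)[:k]      # lowest entropy first
    fused = probs[chosen].mean(axis=0)    # average the selected predictions
    return chosen, entropy, fused

# Toy example: 4 hypothetical prompts, 3 classes.
logits = np.array([
    [4.0, 0.0, 0.0],   # confident prediction
    [0.1, 0.0, 0.1],   # nearly uniform
    [0.0, 3.0, 0.0],   # fairly confident prediction
    [1.0, 1.0, 1.0],   # exactly uniform
])
chosen, entropy, fused = select_low_entropy_prompts(logits, k=2)
print("selected prompts:", chosen.tolist())
print("per-prompt entropy:", np.round(entropy, 3).tolist())
```

In this toy input the two confident prompts are selected and the near-uniform ones are discarded, which is the behavior a low-entropy criterion is meant to produce. How BioVLM scores, thresholds, or combines the selected prompts may differ from this sketch.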
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | OrganSMNIST | Accuracy | 45.41 | 152 |
| Classification | PneumoniaMNIST | Accuracy | 79.65 | 94 |
| Image Classification | BreastMNIST | Accuracy | 65.6 | 74 |
| Biomedical Image Classification | 11 Biomedical Datasets Average (test) | Accuracy (Avg) | 77.62 | 73 |
| Medical Image Classification | DermaMNIST | Accuracy | 45.27 | 63 |
| Medical Image Classification | PathMNIST | Accuracy | 81.56 | 61 |
| Medical Image Classification | OCTMNIST | Accuracy | 62.57 | 47 |
| Classification | TissueMNIST | Accuracy | 27.34 | 40 |
| Medical Image Classification | RetinaMNIST | Accuracy | 40.08 | 32 |
| Medical Image Classification | OrganAMNIST | Accuracy | 54.75 | 21 |