MAKE: Multi-Aspect Knowledge-Enhanced Vision-Language Pretraining for Zero-shot Dermatological Assessment
About
Dermatological diagnosis represents a complex multimodal challenge that requires integrating visual features with specialized clinical knowledge. While vision-language pretraining (VLP) has advanced medical AI, its effectiveness in dermatology is limited by text length constraints and the lack of structured texts. In this paper, we introduce MAKE, a Multi-Aspect Knowledge-Enhanced vision-language pretraining framework for zero-shot dermatological tasks. Recognizing that comprehensive dermatological descriptions require multiple knowledge aspects that exceed standard text constraints, our framework introduces: (1) a multi-aspect contrastive learning strategy that uses large language models to decompose clinical narratives into knowledge-enhanced sub-captions, (2) a fine-grained alignment mechanism that connects sub-captions with diagnostically relevant image features, and (3) a diagnosis-guided weighting scheme that adaptively prioritizes different sub-captions based on clinical significance priors. Through pretraining on 403,563 dermatological image-text pairs collected from educational resources, MAKE significantly outperforms state-of-the-art VLP models on eight datasets across zero-shot skin disease classification, concept annotation, and cross-modal retrieval tasks. Our code will be made publicly available at https://github.com/SiyuanYan1/MAKE.
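To make the idea of weighted multi-aspect contrastive learning more concrete, the following is a minimal sketch and not the authors' implementation. It assumes that each image comes with one embedding per aspect sub-caption, that per-image aspect weights are supplied as inputs (the abstract does not specify how the diagnosis-guided weights are computed), and that a standard image-to-text InfoNCE loss is used. The function names `info_nce` and `multi_aspect_loss` and the temperature value are hypothetical choices for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [a / norm for a in v]


def info_nce(image_embs, text_embs, temperature=0.07):
    """Per-image image-to-text InfoNCE loss; matching pairs share an index."""
    imgs = [normalize(v) for v in image_embs]
    txts = [normalize(v) for v in text_embs]
    losses = []
    for i, img in enumerate(imgs):
        logits = [dot(img, t) / temperature for t in txts]
        m = max(logits)  # subtract the max for a stable log-sum-exp
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        losses.append(log_z - logits[i])
    return losses


def multi_aspect_loss(image_embs, subcaption_embs, weights, temperature=0.07):
    """Weighted sum of per-aspect contrastive losses, averaged over images.

    subcaption_embs[k][i]: embedding of the k-th aspect sub-caption of image i.
    weights[i][k]: assumed, externally supplied weight of aspect k for image i.
    """
    n = len(image_embs)
    total = 0.0
    for k, text_embs in enumerate(subcaption_embs):
        per_image = info_nce(image_embs, text_embs, temperature)
        total += sum(weights[i][k] * per_image[i] for i in range(n))
    return total / n


# Toy example: two images, two aspects, uniform weights.
images = [[1.0, 0.0], [0.0, 1.0]]
aspects = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[0.9, 0.1], [0.1, 0.9]],
]
uniform = [[0.5, 0.5], [0.5, 0.5]]
aligned_loss = multi_aspect_loss(images, aspects, uniform)
```

Swapping the sub-captions between the two images in this toy example should increase the loss, which is the behavior a contrastive objective is meant to have. A full training objective would typically also include the symmetric text-to-image term, which is omitted here for brevity.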
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Concept Annotation | SkinCon (test) | AUROC | 0.7873 | 11 |
| Disease Classification | DermNet (test) | Accuracy | 82.66 | 11 |
| Disease Classification | F17K (test) | Accuracy | 0.3242 | 11 |
| Disease Classification | SD-128 (test) | ACC | 39.14 | 11 |
| Disease Classification | SNU-134 (test) | Accuracy | 32.7 | 11 |
| Image-to-Text Retrieval | SkinCAP | Recall@10 | 0.2096 | 11 |
| Text-to-Image Retrieval | SkinCAP | Recall@10 | 19.95 | 11 |
| Disease Classification | PAD (test) | Accuracy | 0.5953 | 11 |
| Concept Annotation | Derm7pt (test) | AUROC | 68.64 | 11 |
| Medical Image Classification | PH2 Dermoscopy 2 | W_F1 | 88.2 | 6 |
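
The retrieval rows above report Recall@10. As a rough illustration of how such a metric is commonly computed, here is a minimal sketch that assumes each query has exactly one correct candidate, located at the same index. The function name `recall_at_k` is hypothetical, and the exact evaluation protocol used for SkinCAP may differ.

```python
def recall_at_k(similarity, k):
    """Fraction of queries whose true match ranks in the top k candidates.

    similarity[q][c] is the score between query q and candidate c.
    Assumption: the correct candidate for query q is candidate q.
    """
    hits = 0
    for q, row in enumerate(similarity):
        # Candidate indices sorted by descending similarity score.
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        if q in ranked[:k]:
            hits += 1
    return hits / len(similarity)


# Toy similarity matrix: 3 queries scored against 3 candidates.
sim = [
    [0.9, 0.1, 0.0],
    [0.2, 0.1, 0.7],
    [0.3, 0.4, 0.2],
]
r1 = recall_at_k(sim, 1)
r3 = recall_at_k(sim, 3)
```

In this toy matrix only the first query ranks its own candidate first, so `r1` is one third, while `r3` is 1.0 because every candidate falls within the top three.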