
Rad-VLSM: A Cross-Modal Framework with Semantics-Assisted Prompting for Medical Segmentation and Diagnosis

About

Medical image segmentation is more clinically valuable when it supports diagnosis rather than merely producing lesion masks. However, diagnostically relevant lesion cues are often subtle and localized, while existing models may be distracted by background tissues, acoustic artifacts, and irrelevant visual correlations. To address this problem, we propose Rad-VLSM, a two-stage cross-modal framework for semantics-assisted lesion focusing, robust segmentation, and visually grounded diagnosis. In the first stage, a BLIP-2-based vision-language alignment module identifies lesion-related candidate regions under semantic guidance and converts them into box prompts. In the second stage, these prompts are fed into a SAM-based multitask network, where a multi-candidate region aggregation strategy improves prompt stability and guides lesion segmentation. The predicted masks are then used as spatial priors for diagnosis, and a visual-radiomics fusion head integrates lesion-aware visual features with selected radiomics descriptors. By using semantic information for localization rather than direct prediction, Rad-VLSM reduces text-to-diagnosis dependence and grounds diagnosis in lesion-level evidence. Experiments on a private clinical breast ultrasound dataset and public benchmarks show that Rad-VLSM achieves strong segmentation and diagnostic performance with favorable generalization.
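The abstract does not say how the multi-candidate region aggregation combines candidate regions into a box prompt. As a purely illustrative sketch, one simple option is a score-weighted average of the highest-scoring candidate boxes. The function name `aggregate_boxes`, the `top_k` parameter, and the weighting scheme are all assumptions made for illustration, not the authors' method.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def aggregate_boxes(candidates: List[Tuple[Box, float]], top_k: int = 3) -> Box:
    """Fuse the top_k highest-scoring candidate boxes into one box prompt.

    Hypothetical aggregation rule (assumption): a score-weighted average
    of box coordinates. The paper may use a different strategy.
    """
    if not candidates:
        raise ValueError("need at least one candidate box")
    # Keep only the most confident candidates.
    kept = sorted(candidates, key=lambda c: c[1], reverse=True)[:top_k]
    total = sum(score for _, score in kept)
    if total <= 0:
        raise ValueError("scores must sum to a positive value")
    # Weighted mean of each coordinate across the kept boxes.
    return tuple(
        sum(box[i] * score for box, score in kept) / total for i in range(4)
    )


candidates = [
    ((10.0, 10.0, 50.0, 50.0), 0.5),
    ((20.0, 20.0, 60.0, 60.0), 0.5),
    ((0.0, 0.0, 100.0, 100.0), 0.1),  # low-score outlier, dropped when top_k=2
]
prompt_box = aggregate_boxes(candidates, top_k=2)
```

Averaging several candidates rather than trusting a single box is one way a prompt can become less sensitive to an individual mislocalized region, which is the kind of stability the abstract refers to.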

Fengyi Zhang, Xujie Zeng, Mohan Liu, Zengyi Wang, Yalong Jiang • 2026

Related benchmarks

Task                         Dataset                              Metric           Result   Rank
Skin Lesion Segmentation     ISIC 2016                            Dice Score (D)   93.68    40
Binary Classification        clinical breast ultrasound dataset   Accuracy         97.67    30
Lesion Segmentation          ISIC 2018                            Dice Score       93.31    26
Medical Image Segmentation   colon polyp                          mIoU             87.63    25
Segmentation                 clinical breast ultrasound dataset   mDSC             90.75    17
Lesion Segmentation          clinical breast ultrasound dataset   mDSC             92.17    13
Segmentation                 MRI Brain Tumor                      mDSC             92.58    9
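
The segmentation results above are reported as Dice scores (mDSC is the mean Dice similarity coefficient) and mIoU. The page does not state how these benchmarks compute them, so the following is a minimal sketch of the standard per-mask definitions, Dice = 2|P ∩ T| / (|P| + |T|) and IoU = |P ∩ T| / |P ∪ T|. The helper name `dice_and_iou` and the choice to score two empty masks as 1.0 are assumptions.

```python
from typing import Sequence, Tuple


def dice_and_iou(pred: Sequence[int], target: Sequence[int]) -> Tuple[float, float]:
    """Return (Dice, IoU) for two flattened binary masks of equal length."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    inter = sum(1 for p, t in zip(pred, target) if p and t)
    pred_area = sum(1 for p in pred if p)
    target_area = sum(1 for t in target if t)
    union = pred_area + target_area - inter
    if union == 0:
        # Both masks are empty; treating this as a perfect match is an assumption.
        return 1.0, 1.0
    dice = 2.0 * inter / (pred_area + target_area)
    iou = inter / union
    return dice, iou


# Tiny worked example: one overlapping pixel, two foreground pixels per mask.
dice, iou = dice_and_iou([1, 1, 0, 0], [1, 0, 1, 0])
```

In this example the Dice score is 0.5 and the IoU is 1/3. Averaging the per-image values over a test set gives mean scores of the kind listed in the table, which appear to be expressed as percentages.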