
Sparse Spectral LoRA: Routed Experts for Medical VLMs

About

Large vision-language models (VLMs) excel on general benchmarks but often lack robustness in medical imaging, where heterogeneous supervision induces cross-dataset interference and sensitivity to data regime (i.e., how the supervisory signals are mixed). In realistic clinical workflows, data and tasks arrive sequentially, so naive continual training further leads to catastrophic forgetting. To address these challenges, we propose MedQwen, a parameter-efficient medical VLM that couples a spectrally routed Mixture-of-Experts (MoE) with a theoretically grounded scaling rule that aligns low-rank updates with a full-rank, fully fine-tuned MoE, without changing the base architecture. Concretely, we initialize each expert from non-overlapping singular value decomposition (SVD) segments of the pretrained weight and introduce a residual compensation and scaling scheme to enable stable expert specialization and consistent routing under distribution shift. Across 23 medical datasets covering visual question answering, report generation, radiology classification, and hallucination mitigation, MedQwen achieves strong, reliable performance: it approaches full fine-tuning on zero-shot classification with 339× fewer trainable parameters, and reduces sequential forgetting to ~5% where strong baselines degrade by >20-50%.
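
The initialization described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function name `init_spectral_experts`, the even split of singular values between the two low-rank factors, and the definition of the residual as the pretrained weight minus the sum of all expert products are assumptions made for illustration. Routing and the scaling rule mentioned in the abstract are not modeled.

```python
import numpy as np


def init_spectral_experts(W, num_experts, rank):
    """Hypothetical helper: build low-rank experts from disjoint SVD segments of W.

    Expert e takes singular directions [e*rank, (e+1)*rank), so no two experts
    share a singular vector. Returns a list of (B, A) factor pairs plus the
    residual W - sum_e(B_e @ A_e), an assumed form of residual compensation.
    """
    U, S, Vt = np.linalg.svd(W, full_matrices=False)
    if num_experts * rank > S.shape[0]:
        raise ValueError("not enough singular values for the requested experts")

    experts = []
    for e in range(num_experts):
        seg = slice(e * rank, (e + 1) * rank)
        sqrt_s = np.sqrt(S[seg])
        B = U[:, seg] * sqrt_s           # shape (out_dim, rank)
        A = sqrt_s[:, None] * Vt[seg]    # shape (rank, in_dim)
        experts.append((B, A))

    # Assumption: whatever the experts do not cover stays in a residual matrix,
    # so that residual + sum of expert products reproduces W.
    residual = W - sum(B @ A for B, A in experts)
    return experts, residual


# Toy usage with a random stand-in for a pretrained weight matrix.
rng = np.random.default_rng(0)
W = rng.standard_normal((8, 6))
experts, residual = init_spectral_experts(W, num_experts=3, rank=2)
```

Because the columns of `U` are orthonormal, experts built from different segments start in mutually orthogonal subspaces, which is one way to read "non-overlapping" in the abstract.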

Omid Nejati Manzari, Hojat Asgariandehkordi, Taha Koleilat, Yiming Xiao, Hassan Rivaz • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Medical Visual Question Answering | VQA-RAD | -- | 198 |
| Medical Visual Question Answering | PathVQA | -- | 86 |
| Medical Report Generation | MIMIC-CXR | ROUGE-L 25.69 | 28 |
| Image Classification | Breast UltraSound (BUS) dataset | Accuracy 77.23 | 21 |
| Radiology VQA | IU-Xray | Accuracy 90.33 | 20 |
| Medical Visual Question Answering | Slake | Closed Accuracy 75.3 | 17 |
| Classification | Thyroid | -- | 17 |
| Medical Visual Question Answering | OMVQA | Accuracy 70.6 | 13 |
| Hallucination Evaluation | Knowledge Deficiency Hallucination Open-ended Evaluation (test) | BertScore 92.75 | 12 |
| Visual Misinterpretation Hallucination | Visual Misinterpretation Hallucination Open-ended (test) | CheXbert Score 35.8 | 12 |
