
Multimodal LLM With Hierarchical Mixture-of-Experts for VQA on 3D Brain MRI

About

Multiparametric 3D brain MRI (mpMRI) is central to neuroradiology, but producing tumor location, appearance, size, and involvement of critical structures for neurosurgical planning remains challenging. We introduce mpLLM, a multimodal LLM for visual question answering (VQA) on mpMRI that produces clinically interpretable tumor descriptors (e.g., volume, morphology, extent, and coarse localization) as an adjunct to clinical expertise for referring neurosurgeons. mpLLM uses a prompt-conditioned hierarchical mixture-of-experts (MoE) to fuse multiple 3D sequences via routing over modality- and token-level projection experts, enabling data-efficient end-to-end training without large-scale image-report pretraining. To address limited paired image-text supervision, we propose a synthetic VQA protocol that derives clinically grounded questions and answers from expert segmentation annotations and is validated with radiologist collaboration. Across multiple mpMRI datasets, mpLLM improves over strong medical VLM baselines by +5.5 points on average (+9.1% relative) and increases radiologist-rated clinical acceptability by +15.9 points (+46.6% relative). Our study features three main contributions: (1) the first VQA dataset for 3D brain mpMRI, (2) a hierarchical MoE architecture for joint reasoning over interrelated 3D sequences, and (3) expert-supported evidence of clinical utility. Source code is available at https://github.com/arvindmvepa/mpllm, and we will release the dataset upon publication.
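The synthetic VQA protocol described above derives questions and answers from expert segmentation annotations. As a rough illustration only, the sketch below shows how a volume and a coarse position answer could be computed from a binary tumor mask. It is not the authors' pipeline: the function names, the question wording, the `x_low`/`x_high` labels, and the nested-list mask layout are all assumptions made for this example, and a real implementation would need image orientation metadata before mapping an axis to an anatomical side.

```python
from typing import Dict, List, Tuple

# Hypothetical layout: a binary tumor mask indexed as mask[z][y][x].
Mask = List[List[List[int]]]


def tumor_voxels(mask: Mask) -> List[Tuple[int, int, int]]:
    """Return the (z, y, x) indices of every voxel labeled as tumor (value 1)."""
    return [
        (z, y, x)
        for z, plane in enumerate(mask)
        for y, row in enumerate(plane)
        for x, value in enumerate(row)
        if value == 1
    ]


def volume_mm3(mask: Mask, spacing_mm: Tuple[float, float, float]) -> float:
    """Tumor volume = number of tumor voxels times the volume of one voxel."""
    voxel_volume = spacing_mm[0] * spacing_mm[1] * spacing_mm[2]
    return len(tumor_voxels(mask)) * voxel_volume


def coarse_x_half(mask: Mask) -> str:
    """Report which half of the x-axis contains the tumor centroid.

    Assumption: this only compares array indices; it says nothing about
    anatomical left or right without orientation information.
    """
    voxels = tumor_voxels(mask)
    if not voxels:
        return "none"
    width = len(mask[0][0])
    centroid_x = sum(x for _, _, x in voxels) / len(voxels)
    return "x_low" if centroid_x < (width - 1) / 2 else "x_high"


def make_qa_pairs(mask: Mask, spacing_mm: Tuple[float, float, float]) -> List[Dict[str, str]]:
    """Build illustrative question/answer pairs from one segmentation mask."""
    return [
        {
            "question": "What is the tumor volume?",
            "answer": f"{volume_mm3(mask, spacing_mm):.1f} mm^3",
        },
        {
            "question": "In which half of the x-axis is the tumor centered?",
            "answer": coarse_x_half(mask),
        },
    ]
```

Calling `make_qa_pairs(mask, spacing_mm)` on one annotated scan yields a small list of question/answer dictionaries. Deriving answers from the mask rather than from free-text reports is what lets supervision be generated without paired image-report data.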

Arvind Murari Vepa, Yannan Yu, Jingru Gan, Anthony Cuturrufo, Michael F. Romano, Weikai Li, Fabien Scalzo, Wei Wang, Yizhou Sun • 2025

Related benchmarks

Task | Dataset | Result | Rank
VQA | GLI (test) | Volume 62.5 | 6
VQA | GOAT (test) | Volume 63 | 6
VQA | Met (test) | Volume Score 65.7 | 6
Visual Question Answering | GLI (val) | Volume Score 43 | 3
Classification | GLI (Primary Gliomas vs Secondary Metastatic Lesions) (val) | Accuracy 95.6 | 2
Radiologist Acceptance | GLI (val) | Radiologist Acceptance Rate 50 | 2
