M3D: Advancing 3D Medical Image Analysis with Multi-Modal Large Language Models
About
Medical image analysis is essential to clinical diagnosis and treatment, which is increasingly supported by multi-modal large language models (MLLMs). However, previous research has primarily focused on 2D medical images, leaving 3D images under-explored, despite their richer spatial information. This paper aims to advance 3D medical image analysis with MLLMs. To this end, we present a large-scale 3D multi-modal medical dataset, M3D-Data, comprising 120K image-text pairs and 662K instruction-response pairs specifically tailored for various 3D medical tasks, such as image-text retrieval, report generation, visual question answering, positioning, and segmentation. Additionally, we propose M3D-LaMed, a versatile multi-modal large language model for 3D medical image analysis. Furthermore, we introduce a new 3D multi-modal medical benchmark, M3D-Bench, which facilitates automatic evaluation across eight tasks. Through comprehensive evaluation, our method proves to be a robust model for 3D medical image analysis, outperforming existing solutions. All code, data, and models are publicly available at: https://github.com/BAAI-DCAI/M3D.
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Multi-Modal Visual Question Answering (MMVQA) | RAD-ChestCT (val) | Accuracy | 24.28 | 57 |
| Multi-Modal Visual Question Answering (MMVQA) | CT-RATE (val) | Accuracy | 22.68 | 57 |
| Classification | Rad-ChestCT | AUC | 69.8 | 25 |
| Classification | CC-CCII | Accuracy | 83.8 | 24 |
| Classification | CT-RATE | AUC | 0.807 | 24 |
| Classification | LUNA16 | AUC | 0.684 | 16 |
| 3D CT Captioning and Question Answering | M3D | Captioning Score | 46.3 | 9 |
| Medical Report Generation and Diagnostic Classification | Alzheimer's Disease Progression CN vs. CI | BLEU Score | 0.3627 | 9 |
| Medical Report Generation and Diagnostic Classification | Alzheimer's Disease Progression CN vs. MCI | BLEU | 0.3437 | 9 |
| Recognition | DeepTumorVQA refined subset 2025b | Colon Lesion Existence | 60 | 9 |