| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Question Answering | M3D 1.0 (test) | Accuracy: 77.79 | 10 |
| 3D CT Captioning and Question Answering | M3D | Captioning Score: 46.3 | 9 |
| Medical Visual Question Answering | M3D | Plane Score: 92.8 | 9 |
| Multimodal Information Extraction | M3D Chinese (ZH) | Entity Recognition F1: 82.26 | 7 |
| Multimodal Information Extraction | M3D English | Entity Recognition F1: 79.56 | 7 |