
Missing Modality Robustness in Semi-Supervised Multi-Modal Semantic Segmentation

About

Using multiple spatial modalities has been proven helpful in improving semantic segmentation performance. However, two real-world challenges remain largely unaddressed: (a) improving label efficiency and (b) staying robust in realistic scenarios where modalities are missing at test time. To address these challenges, we first propose Linear Fusion, a simple yet efficient multi-modal fusion mechanism that outperforms state-of-the-art multi-modal models even with limited supervision. Second, we propose M3L: Multi-modal Teacher for Masked Modality Learning, a semi-supervised framework that not only improves multi-modal performance but also uses unlabeled data to make the model robust to realistic missing-modality scenarios. We create the first benchmark for semi-supervised multi-modal semantic segmentation and also report robustness to missing modalities. Our proposal shows an absolute improvement of up to 10% on robust mIoU over the most competitive baselines. Our code is available at https://github.com/harshm121/M3L
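The abstract names the two components without detailing them, so the following is a minimal sketch of what a linear fusion of two modality feature maps and a masked-modality teacher-student step could look like. The module names, tensor shapes, zero-masking of the dropped modality, and the teacher/student call signatures are illustrative assumptions, not the paper's implementation; see the linked repository for the actual code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LinearFusion(nn.Module):
    """Sketch: fuse per-modality features with a learned linear (1x1 conv) projection.

    Assumption: each modality encoder yields a feature map of shape (B, C, H, W);
    the fused map keeps the same shape and feeds a shared segmentation decoder.
    """

    def __init__(self, channels: int, num_modalities: int = 2):
        super().__init__()
        # A 1x1 convolution is a linear map over the concatenated modality channels.
        self.proj = nn.Conv2d(channels * num_modalities, channels, kernel_size=1)

    def forward(self, features: list) -> torch.Tensor:
        return self.proj(torch.cat(features, dim=1))


def masked_modality_step(student, teacher, rgb, depth, drop_prob: float = 0.5):
    """Sketch of one unlabeled-data step: the multi-modal teacher sees both modalities,
    while the student may have one modality masked out (zeroed here) so it learns to
    remain robust when a modality is missing at test time.
    """
    with torch.no_grad():
        # Hard pseudo-labels of shape (B, H, W) from the teacher's logits.
        pseudo_labels = teacher(rgb, depth).argmax(dim=1)

    # Randomly mask one modality for the student (zero-masking assumed for illustration).
    if torch.rand(1).item() < drop_prob:
        student_logits = student(rgb, torch.zeros_like(depth))
    else:
        student_logits = student(torch.zeros_like(rgb), depth)

    return F.cross_entropy(student_logits, pseudo_labels)
```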

Harsh Maheshwari, Yen-Cheng Liu, Zsolt Kira • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Semantic segmentation | DeLiVER | mIoU (Mean) | 47.64 | 30 |
| Semantic segmentation | DFC 2023 | R Score | 94.14 | 20 |
| Semantic segmentation | ISPRS (test) | R | 41.27 | 10 |
| Semantic segmentation | ISPRS | mIoU (R) | 30.72 | 10 |
| Semantic segmentation | Stanford Indoor (0.1% labeled, 49 samples) | mIoU | 40.05 | 8 |
| Semantic segmentation | Stanford Indoor (0.2% labeled, 98 samples) | mIoU | 44.62 | 8 |
| Semantic segmentation | Stanford Indoor (1% labeled, 491 samples) | mIoU | 49.28 | 8 |
| Semantic segmentation | SUN RGBD (6.25% labeled, 297 samples) | mIoU (RGB) | 29.92 | 6 |
| Semantic segmentation | SUN RGBD (12.5% labeled, 594 samples) | mIoU (RGB) | 38.12 | 6 |
| Semantic segmentation | SUN RGBD (25% labeled, 1189 samples) | mIoU (RGB) | 41.31 | 6 |
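Most results above are reported as mean intersection-over-union (mIoU). As a quick reference, here is a minimal sketch of how per-class IoU and its mean are typically computed from predicted and ground-truth label maps; the ignore-index handling and class-skipping rule are assumptions and may differ from each benchmark's official evaluation script.

```python
import numpy as np


def mean_iou(pred: np.ndarray, target: np.ndarray, num_classes: int,
             ignore_index: int = 255) -> float:
    """Compute mean IoU over classes from integer label maps of identical shape."""
    valid = target != ignore_index
    pred, target = pred[valid], target[valid]

    ious = []
    for c in range(num_classes):
        inter = np.logical_and(pred == c, target == c).sum()
        union = np.logical_or(pred == c, target == c).sum()
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(inter / union)
    return float(np.mean(ious)) if ious else 0.0
```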

Other info

Code: https://github.com/harshm121/M3L
