
Missing Modality Robustness in Semi-Supervised Multi-Modal Semantic Segmentation

About

Using multiple spatial modalities has been shown to improve semantic segmentation performance. However, several real-world challenges have yet to be addressed: (a) improving label efficiency and (b) enhancing robustness in realistic scenarios where modalities are missing at test time. To address these challenges, we first propose a simple yet efficient multi-modal fusion mechanism, Linear Fusion, that performs better than state-of-the-art multi-modal models even with limited supervision. Second, we propose M3L: Multi-modal Teacher for Masked Modality Learning, a semi-supervised framework that not only improves multi-modal performance but also uses unlabeled data to make the model robust to the realistic missing-modality scenario. We create the first benchmark for semi-supervised multi-modal semantic segmentation and also report robustness to missing modalities. Our proposal shows an absolute improvement of up to 10% in robust mIoU over the most competitive baselines. Our code is available at https://github.com/harshm121/M3L

Harsh Maheshwari, Yen-Cheng Liu, Zsolt Kira • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | Potsdam (test) | mIoU 72.2 | 193 |
| Semantic segmentation | DSTL (test) | IoU 88.8 | 56 |
| Semantic segmentation | Hunan (test) | mIoU 61.8 | 56 |
| Semantic segmentation | DeLiVER | mIoU (Mean) 47.64 | 30 |
| Semantic segmentation | DFC 2023 | R Score 94.14 | 20 |
| Semantic segmentation | ISPRS (test) | R 41.27 | 10 |
| Semantic segmentation | ISPRS | mIoU (R) 30.72 | 10 |
| Semantic segmentation | Stanford Indoor (0.1% labeled (49 samples)) | mIoU 40.05 | 8 |
| Semantic segmentation | Stanford Indoor 0.2% labeled (98 samples) | mIoU 44.62 | 8 |
| Semantic segmentation | Stanford Indoor 1% labeled (491 samples) | mIoU 49.28 | 8 |
Showing 10 of 13 rows
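
Most results in the table above are reported as mIoU (mean intersection-over-union). As a rough illustration of what that number measures, here is a minimal sketch of the standard per-class IoU average. It assumes predictions and ground truth are flattened sequences of integer class labels; the function name `mean_iou` and the choice to skip classes absent from both inputs are assumptions made for this example, and individual benchmarks may handle ignored labels or absent classes differently.

```python
from typing import Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Average the per-class IoU over classes present in pred or target."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where both agree on class c
    pred_count = [0] * num_classes    # pixels predicted as class c
    target_count = [0] * num_classes  # pixels labeled as class c

    for p, t in zip(pred, target):
        pred_count[p] += 1
        target_count[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_count[c] + target_count[c] - intersection[c]
        if union > 0:  # skip classes that appear in neither input (assumption)
            ious.append(intersection[c] / union)

    return sum(ious) / len(ious) if ious else 0.0


# Toy example with three classes and six pixels.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.2f}%")
```

The function returns a fraction in [0, 1]; the table's values appear to be on a 0 to 100 scale, so multiply by 100 to compare.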
