# Multi-Space Alignments Towards Universal LiDAR Segmentation

## About
A unified and versatile LiDAR segmentation model with strong robustness and generalizability is desirable for safe autonomous driving perception. This work presents M3Net, a one-of-a-kind framework that fulfills multi-task, multi-dataset, multi-modality LiDAR segmentation in a universal manner using a single set of parameters. To better exploit data volume and diversity, we first combine large-scale driving datasets acquired by different types of sensors from diverse scenes, and then conduct alignments in three spaces during training: the data, feature, and label spaces. As a result, M3Net is capable of taming heterogeneous data for training state-of-the-art LiDAR segmentation models. Extensive experiments on twelve LiDAR segmentation datasets verify the effectiveness of our approach. Notably, using a shared set of parameters, M3Net achieves mIoU scores of 75.1%, 83.1%, and 72.4% on the official benchmarks of SemanticKITTI, nuScenes, and Waymo Open, respectively.
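The label-space alignment mentioned above can be illustrated with a small sketch. The class names, dataset keys, and mappings below are illustrative toy examples, not the actual taxonomies or the paper's implementation: the idea is simply that each dataset's labels are translated into one shared vocabulary so a single model can train on all of them.

```python
# Hypothetical sketch of label-space alignment across heterogeneous datasets.
# All names and mappings here are illustrative, not taken from M3Net itself.

# Toy per-dataset taxonomies (real datasets define many more classes).
UNIFIED_MAP = {
    ("semantickitti", "car"): "car",
    ("semantickitti", "bicycle"): "bicycle",
    ("semantickitti", "road"): "road",
    ("semantickitti", "vegetation"): "vegetation",
    ("nuscenes", "vehicle.car"): "car",
    ("nuscenes", "vehicle.bicycle"): "bicycle",
    ("nuscenes", "flat.driveable_surface"): "road",
}

def build_unified_space(mapping):
    """Index the sorted set of unified class names with integer ids."""
    names = sorted(set(mapping.values()))
    return {name: idx for idx, name in enumerate(names)}

def align_labels(dataset, labels, mapping, unified_index):
    """Translate one dataset's string labels into unified integer ids."""
    return [unified_index[mapping[(dataset, lab)]] for lab in labels]

unified = build_unified_space(UNIFIED_MAP)
ids = align_labels(
    "nuscenes", ["vehicle.car", "flat.driveable_surface"], UNIFIED_MAP, unified
)
```

With labels from every source dataset expressed in the same integer id space, batches drawn from different datasets can feed one shared segmentation head.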
## Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic segmentation | SemanticKITTI (test) | mIoU | 75.1 | 335 |
| Semantic segmentation | nuScenes (val) | mIoU (Segmentation) | 0.79 | 212 |
| LiDAR Semantic Segmentation | nuScenes (val) | mIoU | 80.9 | 169 |
| LiDAR Semantic Segmentation | nuScenes official (test) | mIoU | 83.1 | 132 |
| LiDAR Semantic Segmentation | SemanticKITTI (test) | mIoU | 75.1 | 125 |
| Semantic segmentation | SemanticKITTI (val) | mIoU | 72 | 117 |
| LiDAR Semantic Segmentation | SemanticKITTI (val) | mIoU | 72 | 87 |
| Semantic segmentation | Waymo Open Dataset (val) | mIoU | 72.4 | 63 |
| LiDAR-based Panoptic Segmentation | nuScenes (val) | PQ | 71.7 | 17 |
| Panoptic LiDAR Segmentation | Panoptic-SemanticKITTI (val) | PQ | 63.87 | 10 |