
Point-MoE: Large-Scale Multi-Dataset Training with Mixture-of-Experts for 3D Semantic Segmentation

About

While massively scaling both data and models has become central in NLP and 2D vision, the benefits of such scaling for 3D point cloud understanding remain limited. We study the initial step of scaling 3D point cloud understanding under a realistic regime: large-scale multi-dataset joint training for 3D semantic segmentation, with no dataset labels available at training or inference time. Point clouds arise from a wide range of sensors (e.g., depth cameras, LiDAR) and scenes (e.g., indoor, outdoor), yielding heterogeneous scanning patterns, sampling densities, and semantic biases; naively mixing such datasets degrades standard models. We therefore introduce Point-MoE, a Mixture-of-Experts design that expands model capacity through sparsely activated expert MLPs and a lightweight top-$k$ router, allowing tokens to select specialized experts without requiring dataset supervision. Trained jointly on a diverse mix of indoor and outdoor datasets, and evaluated on seen datasets as well as in zero-shot settings, Point-MoE outperforms prior methods without using dataset labels for either training or inference. This outlines a scalable path for 3D perception: letting the model discover structure in heterogeneous 3D data rather than imposing it via manual curation or dataset-specific heuristics.
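The routing idea in the abstract (a lightweight top-$k$ router sending each token to a few sparsely activated expert MLPs) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the layer sizes, the random weights, the ReLU experts, and the choice to renormalise gates with a softmax over only the selected experts are all assumptions made for illustration, and the names `route_top_k` and `moe_layer` are hypothetical.

```python
import math
import random


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vec):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


def make_expert(rng, dim, hidden):
    """Random weights for a two-layer MLP expert (sizes are illustrative)."""
    w1 = [[rng.gauss(0, 0.1) for _ in range(dim)] for _ in range(hidden)]
    w2 = [[rng.gauss(0, 0.1) for _ in range(hidden)] for _ in range(dim)]
    return w1, w2


def run_expert(expert, token):
    """Apply one expert MLP: linear, ReLU, linear."""
    w1, w2 = expert
    hidden = [max(0.0, h) for h in matvec(w1, token)]
    return matvec(w2, hidden)


def route_top_k(router_weights, token, k):
    """Return (expert_index, gate) pairs for the k highest-scoring experts."""
    logits = matvec(router_weights, token)
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    # Assumption: gates are a softmax over the selected logits only.
    gates = softmax([logits[i] for i in top])
    return list(zip(top, gates))


def moe_layer(token, router_weights, experts, k=2):
    """Gate-weighted sum of the outputs of the k selected experts."""
    out = [0.0] * len(token)
    for idx, gate in route_top_k(router_weights, token, k):
        y = run_expert(experts[idx], token)  # only selected experts are evaluated
        out = [o + gate * v for o, v in zip(out, y)]
    return out


# Toy usage with hypothetical sizes; no dataset label is passed anywhere.
rng = random.Random(0)
DIM, HIDDEN, NUM_EXPERTS = 4, 8, 4
router_weights = [[rng.gauss(0, 1) for _ in range(DIM)] for _ in range(NUM_EXPERTS)]
experts = [make_expert(rng, DIM, HIDDEN) for _ in range(NUM_EXPERTS)]
token = [0.5, -1.0, 0.25, 2.0]
selected = route_top_k(router_weights, token, k=2)
output = moe_layer(token, router_weights, experts, k=2)
```

The point of the sketch is that expert selection depends only on the token's own features, so no dataset identifier is needed at training or inference time, and only `k` of the experts run per token even though total capacity grows with the number of experts.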

Xuweiyi Chen, Wentao Zhou, Aruni RoyChowdhury, Zezhou Cheng • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Semantic segmentation | S3DIS (Area 5) | mIoU 68.1 | 907 |
| Semantic segmentation | ScanNet (val) | mIoU 76.2 | 274 |
| Semantic segmentation | nuScenes (val) | mIoU (Segmentation) 0.689 | 265 |
| Semantic segmentation | SemanticKITTI (val) | mIoU 63.3 | 174 |
| 3D Semantic Segmentation | ScanNet (val) | mIoU 77.4 | 144 |
| 3D Semantic Segmentation | SemanticKITTI (val) | mIoU 66.4 | 57 |
| Semantic segmentation | Structured3D (val) | mIoU 70.1 | 49 |
| 3D Semantic Segmentation | nuScenes (val) | mIoU 72.9 | 43 |
| 3D Semantic Segmentation | S3DIS Area5 | mIoU 69.5 | 41 |
| Semantic segmentation | Structured3D (test) | mIoU 70.8 | 34 |
Showing 10 of 19 rows
