Point-MoE: Large-Scale Multi-Dataset Training with Mixture-of-Experts for 3D Semantic Segmentation

About

While massively scaling both data and models has become central in NLP and 2D vision, its benefits for 3D point cloud understanding remain limited. We study the initial step of scaling 3D point cloud understanding under a realistic regime: large-scale multi-dataset joint training for 3D semantic segmentation, with no dataset labels available at training or inference time. Point clouds arise from a wide range of sensors (e.g., depth cameras, LiDAR) and scenes (e.g., indoor, outdoor), yielding heterogeneous scanning patterns, sampling densities, and semantic biases; naively mixing such datasets degrades standard models. Therefore, we introduce Point-MoE, a Mixture-of-Experts design that expands model capacity through sparsely activated expert MLPs and a lightweight top-$k$ router, allowing tokens to select specialized experts without requiring dataset supervision. Trained jointly on a diverse mix of indoor and outdoor datasets, and evaluated on seen datasets as well as in zero-shot settings, Point-MoE outperforms prior methods without using dataset labels for either training or inference. This outlines a scalable path for 3D perception: letting the model discover structure in heterogeneous 3D data rather than imposing it via manual curation or dataset-specific heuristics.

Xuweiyi Chen, Wentao Zhou, Aruni RoyChowdhury, Zezhou Cheng • 2025
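
The abstract describes sparsely activated expert MLPs selected per token by a top-$k$ router. The following is a minimal pure-Python sketch of that general routing pattern, not the authors' implementation. The class name ToyPointMoELayer, the layer sizes, the random initialization, and the choice to renormalize router scores with a softmax over only the selected experts are all assumptions made for illustration; the abstract does not specify these details.

```python
import math
import random


def softmax(xs):
    # Numerically stable softmax over a short list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(W, x):
    # Multiply a matrix (list of rows) by a vector.
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def make_matrix(rows, cols, rng):
    # Small random weights; stands in for learned parameters.
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


class ToyPointMoELayer:
    """Hypothetical top-k MoE layer acting on one point-token feature vector."""

    def __init__(self, dim, hidden, num_experts, top_k, seed=0):
        rng = random.Random(seed)
        self.top_k = top_k
        # Router: one linear score per expert (the "lightweight" part).
        self.router = make_matrix(num_experts, dim, rng)
        # Each expert is a two-layer MLP: dim -> hidden -> dim.
        self.experts = [
            (make_matrix(hidden, dim, rng), make_matrix(dim, hidden, rng))
            for _ in range(num_experts)
        ]

    def route(self, token):
        # Score every expert, keep the k highest, renormalize their scores.
        logits = matvec(self.router, token)
        order = sorted(range(len(logits)), key=lambda e: logits[e], reverse=True)
        chosen = order[: self.top_k]
        weights = softmax([logits[e] for e in chosen])
        return chosen, weights

    def forward(self, token):
        # Only the chosen experts run; the rest are skipped (sparse activation).
        chosen, weights = self.route(token)
        out = [0.0] * len(token)
        for e, w in zip(chosen, weights):
            W1, W2 = self.experts[e]
            hidden = [max(0.0, v) for v in matvec(W1, token)]  # ReLU
            y = matvec(W2, hidden)
            out = [o + w * yi for o, yi in zip(out, y)]
        return out, chosen


layer = ToyPointMoELayer(dim=4, hidden=8, num_experts=4, top_k=2)
token = [0.1, -0.3, 0.7, 0.2]  # made-up feature vector for one point token
out, chosen = layer.forward(token)
chosen_again, weights = layer.route(token)
```

Note that the router sees only the token features, so no dataset label enters the computation; this is the property the abstract emphasizes. In a trained model the router and expert weights would be learned jointly, which this sketch does not cover.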

Related benchmarks

Task	Dataset	Result
Semantic segmentation	S3DIS (Area 5)	mIoU 68.1	1029
Semantic segmentation	nuScenes (val)	mIoU (Segmentation) 0.689	323
Semantic segmentation	ScanNet (val)	mIoU 76.2	302
Semantic segmentation	SemanticKITTI (val)	mIoU 63.3	212
3D Semantic Segmentation	ScanNet (val)	mIoU 77.4	144
3D Semantic Segmentation	SemanticKITTI (val)	mIoU 66.4	75
3D Semantic Segmentation	nuScenes (val)	mIoU 72.9	55
Semantic segmentation	Structured3D (val)	mIoU 70.1	49
3D Semantic Segmentation	S3DIS Area5	mIoU 69.5	41
Semantic segmentation	Structured3D (test)	mIoU 70.8	34

Showing 10 of 19 rows
