DoReMi: Bridging 3D Domains via Topology-Aware Domain-Representation Mixture of Experts

About

Constructing a unified 3D scene understanding model has long been hindered by the significant topological discrepancies across sensor modalities. While the Mixture-of-Experts (MoE) architecture is an effective route to universal understanding, we observe that existing 3D MoE networks often suffer from semantics-driven routing bias, which makes it difficult to handle cross-domain data characterized by "semantic consistency yet topological heterogeneity." To overcome this challenge, we propose DoReMi (Topology-Aware Domain-Representation Mixture of Experts). Specifically, we introduce a self-supervised pre-training branch based on multiple attributes, such as topological and texture variations, to anchor cross-domain structural priors. Building on this, we design a domain-aware expert branch comprising two core mechanisms: Domain Spatial-Guided Routing (DSR), which extracts spatial context to perceive local topological variations, and Entropy-controlled Dynamic Allocation (EDA), which quantifies routing uncertainty to dynamically adjust the number of activated experts and keep training stable. Together, the two branches combine universal feature extraction with adaptive expert allocation. Extensive experiments across various tasks, covering both indoor and outdoor scenes, validate DoReMi: it achieves 80.1% mIoU on the ScanNet validation set and 77.2% mIoU on S3DIS, outperforming existing state-of-the-art methods. The code will be released soon.
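
The abstract describes EDA only at a high level, so the following is a minimal sketch of one way an entropy-controlled expert count could work, not the authors' implementation. It assumes that routing uncertainty is measured as the normalized Shannon entropy of the softmax routing distribution and that the number of activated experts grows linearly with it. The function names and the `k_min` and `k_max` parameters are hypothetical.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of router logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def routing_entropy(probs):
    """Shannon entropy (in nats) of a routing distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def entropy_controlled_top_k(logits, k_min=1, k_max=4):
    """Select a variable number of experts for one token or point.

    Assumption (not from the paper): the expert count is interpolated
    linearly between k_min and k_max by the normalized routing entropy.
    Returns a dict mapping expert index to renormalized gate weight.
    """
    probs = softmax(logits)
    n = len(probs)
    k_max = min(k_max, n)
    k_min = min(k_min, k_max)

    # Normalize entropy to [0, 1] by dividing by its maximum, log(n).
    h = routing_entropy(probs) / math.log(n) if n > 1 else 0.0

    # Low entropy (confident router) -> few experts; high entropy -> more.
    k = k_min + round(h * (k_max - k_min))

    # Keep the k most probable experts and renormalize their weights.
    top = sorted(range(n), key=lambda i: probs[i], reverse=True)[:k]
    mass = sum(probs[i] for i in top)
    return {i: probs[i] / mass for i in top}


if __name__ == "__main__":
    print(entropy_controlled_top_k([10.0, 0.0, 0.0, 0.0]))  # confident router
    print(entropy_controlled_top_k([0.0, 0.0, 0.0, 0.0]))   # uncertain router
```

In this sketch a sharply peaked routing distribution activates a single expert, while a uniform one activates all `k_max` experts. The actual mapping from entropy to expert count used by EDA may differ.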

Mingwei Xing, Xinliang Wang, Yifeng Shi • 2025

Related benchmarks

Task	Dataset	Metric	Result	
Semantic segmentation	S3DIS (Area 5)	mIoU	77.2	1006
Semantic segmentation	nuScenes (val)	mIoU (Segmentation)	0.817	323
Semantic segmentation	ScanNet200 (val)	mIoU	37.2	136
Indoor Semantic Segmentation	ScanNet (val)	mIoU	80.1	30
Semantic segmentation	Matterport3D (val)	mIoU	57.1	22
Outdoor Semantic Segmentation	Waymo (val)	mIoU	72.7	18

