DoReMi: Bridging 3D Domains via Topology-Aware Domain-Representation Mixture of Experts
About
Constructing a unified 3D scene understanding model has long been hindered by the significant topological discrepancies across different sensor modalities. While applying the Mixture-of-Experts (MoE) architecture is an effective approach to achieving universal understanding, we observe that existing 3D MoE networks often suffer from semantics-driven routing bias. This makes it challenging to address cross-domain data characterized by "semantic consistency yet topological heterogeneity." To overcome this challenge, we propose DoReMi (Topology-Aware Domain-Representation Mixture of Experts). Specifically, we introduce a self-supervised pre-training branch based on multi attributes, such as topological and texture variations, to anchor cross-domain structural priors. Building upon this, we design a domain-aware expert branch comprising two core mechanisms: Domain Spatial-Guided Routing (DSR), which achieves an acute perception of local topological variations by extracting spatial contexts, and Entropy-controlled Dynamic Allocation (EDA), which dynamically adjusts the number of activated experts by quantifying routing uncertainty to ensure training stability. Through the synergy of these dual branches, DoReMi achieves a deep integration of universal feature extraction and highly adaptive expert allocation. Extensive experiments across various tasks, encompassing both indoor and outdoor scenes, validate the superiority of DoReMi. It achieves 80.1% mIoU on the ScanNet validation set and 77.2% mIoU on S3DIS, comprehensively outperforming existing state-of-the-art methods. The code will be released soon.
Related benchmarks
| Task | Dataset | Result | Rank | |
|---|---|---|---|---|
| Semantic segmentation | S3DIS (Area 5) | mIOU77.2 | 907 | |
| Semantic segmentation | nuScenes (val) | mIoU (Segmentation)0.817 | 265 | |
| Semantic segmentation | ScanNet200 (val) | mIoU37.2 | 126 | |
| Semantic segmentation | Matterport3D (val) | mIoU57.1 | 22 | |
| Outdoor Semantic Segmentation | Waymo (val) | mIoU72.7 | 18 | |
| Indoor Semantic Segmentation | ScanNet (val) | mIoU80.1 | 17 |