OP2GS: Object-Aware 3D Gaussian Splatting with Dual-Opacity Primitives

About

3D Gaussian Splatting (3DGS) provides an explicit and efficient scene representation, but its primitives lack inherent object-level identity, hindering downstream tasks such as open-vocabulary scene understanding. Existing methods typically address this by either distilling high-dimensional feature embeddings into Gaussians or by lifting 2D mask labels into 3D via heuristic refinement. However, feature-based approaches incur heavy storage and decoding overhead, while lifting-based pipelines remain vulnerable to label contamination: Gaussians necessary for appearance reconstruction often receive incorrect object labels during 2D-to-3D projection. We propose OP2GS, an object-aware Gaussian representation that augments each primitive with an explicit instance identity and a dedicated instance opacity $\sigma^{*}$ for object-mask rendering. The original opacity $\sigma$ remains responsible for visual reconstruction, while $\sigma^{*}$ models whether a Gaussian should contribute to a particular object mask. This dual-opacity formulation decouples visual existence from instance occupancy: mislabeled Gaussians can remain available for image rendering while becoming transparent in the object-mask branch. To learn this representation, we introduce a random object loss that optimizes the 1D instance occupancy field using the standard transmittance-based visibility of 3DGS. Semantic descriptors are then attached at the object level through multi-view aggregation, eliminating per-Gaussian feature storage. Compared with feature-training approaches, OP2GS achieves competitive open-vocabulary performance while significantly reducing computational overhead. Compared with training-free pipelines, it leverages physically consistent occupancy learning to resolve visibility ambiguities.
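The dual-opacity idea can be illustrated with a minimal single-ray sketch. This is not the authors' implementation: the function names (`composite`, `render_ray`), the use of scalar gray values in place of RGB, and in particular the choice to let $\sigma^{*}$ drive both the emission and the transmittance of the mask branch are assumptions made for illustration. The abstract states only that the mask branch relies on the standard transmittance-based visibility of 3DGS.

```python
from typing import Sequence, Tuple


def composite(values: Sequence[float], alphas: Sequence[float]) -> float:
    """Front-to-back alpha compositing: sum_i v_i * a_i * prod_{j<i} (1 - a_j)."""
    transmittance = 1.0
    out = 0.0
    for v, a in zip(values, alphas):
        out += v * a * transmittance
        transmittance *= 1.0 - a
    return out


def render_ray(
    gray: Sequence[float],
    sigma: Sequence[float],
    sigma_star: Sequence[float],
    instance_ids: Sequence[int],
    target_id: int,
) -> Tuple[float, float]:
    """Render one ray through depth-sorted Gaussians (hypothetical helper).

    Returns (image intensity, mask value for target_id).
    """
    # Image branch: the original opacity sigma controls appearance.
    color = composite(gray, sigma)
    # Mask branch (assumed form): only Gaussians labelled target_id emit,
    # and the instance opacity sigma_star replaces sigma in the compositing.
    membership = [1.0 if i == target_id else 0.0 for i in instance_ids]
    mask = composite(membership, sigma_star)
    return color, mask


# Three Gaussians, front to back. Suppose the first one carries a wrong
# label (1), so its instance opacity has been driven to zero.
color, mask = render_ray(
    gray=[0.2, 0.8, 0.5],
    sigma=[0.5, 0.5, 0.5],
    sigma_star=[0.0, 0.9, 0.9],
    instance_ids=[1, 1, 2],
    target_id=1,
)
```

In this toy setting the first Gaussian still contributes to the image intensity through `sigma`, while its zero `sigma_star` makes it transparent in the mask for object 1, which is the decoupling of visual existence from instance occupancy that the abstract describes.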

Guiyu Liu, Niklas Vaara, Janne Mustaniemi, Juho Kannala, Janne Heikkilä • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Semantic Segmentation | 3D-OVS | Bed | 97.5 | 42 |
| Open-Vocabulary 3D Segmentation | LERF-Mask (test) | Figurines mIoU | 92.3 | 19 |
| Novel View Rendering | LeRF-mask | PSNR | 25.21 | 4 |
| 3D Open-vocabulary Segmentation | LERF-Mask figurines scene | Rendering Time | 8.2 | 3 |