
PartSLIP++: Enhancing Low-Shot 3D Part Segmentation via Multi-View Instance Segmentation and Maximum Likelihood Estimation

About

Open-world 3D part segmentation is pivotal in diverse applications such as robotics and AR/VR. Traditional supervised methods often grapple with limited 3D data availability and struggle to generalize to unseen object categories. PartSLIP, a recent advancement, has made significant strides in zero- and few-shot 3D part segmentation. It does so by harnessing the 2D open-vocabulary detection module GLIP and introducing a heuristic method for converting and lifting multi-view 2D bounding box predictions into 3D segmentation masks. In this paper, we introduce PartSLIP++, an enhanced version designed to overcome the limitations of its predecessor. Our approach incorporates two major improvements. First, we utilize a pre-trained 2D segmentation model, SAM, to produce pixel-wise 2D segmentations, yielding more precise annotations than the 2D bounding boxes used in PartSLIP. Second, PartSLIP++ replaces the heuristic 3D conversion process with a modified Expectation-Maximization algorithm. This algorithm treats the 3D instance segmentation as unobserved latent variables and iteratively refines them by alternating between 2D-3D matching and optimization with gradient descent. Through extensive evaluations, we show that PartSLIP++ outperforms PartSLIP in both low-shot 3D semantic and instance-based object part segmentation tasks. Code is released at https://github.com/zyc00/PartSLIP2.
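The alternating scheme described in the abstract can be illustrated with a toy NumPy sketch. This is not the authors' implementation (see the released code for that). It assumes soft per-point instance logits as the latent variables, a greedy one-to-one soft-IoU matching between each view's 2D masks and the 3D instances, and a cross-entropy gradient step on the matched targets. The function names `match_masks` and `lift_masks_to_3d`, the toy views, and all hyperparameters are hypothetical choices made for illustration.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def match_masks(prob, mask_ids, n_masks):
    """Greedy one-to-one matching of 2D masks to 3D instances by soft IoU (assumed criterion)."""
    onehot = np.eye(n_masks)[mask_ids]              # (n_visible, n_masks)
    inter = onehot.T @ prob                         # (n_masks, n_instances)
    union = onehot.sum(0)[:, None] + prob.sum(0)[None, :] - inter
    score = inter / np.maximum(union, 1e-8)
    assignment = {}
    for _ in range(min(score.shape)):
        m, k = np.unravel_index(np.argmax(score), score.shape)
        assignment[int(m)] = int(k)
        score[m, :] = -1.0                          # mask m is taken
        score[:, k] = -1.0                          # instance k is taken
    return assignment

def lift_masks_to_3d(views, n_points, n_instances, n_iters=50, lr=1.0, seed=0):
    """Alternate 2D-3D matching with a gradient step on the per-point logits."""
    rng = np.random.default_rng(seed)
    logits = 0.01 * rng.standard_normal((n_points, n_instances))
    for _ in range(n_iters):
        for visible, mask_ids in views:
            prob = softmax(logits[visible])
            # Matching step: align this view's mask ids with the current 3D instances.
            assignment = match_masks(prob, mask_ids, int(mask_ids.max()) + 1)
            target = np.array([assignment.get(int(m), -1) for m in mask_ids])
            keep = target >= 0                      # skip points in unmatched masks
            # Optimization step: cross-entropy gradient w.r.t. logits is (prob - onehot).
            grad = prob[keep].copy()
            grad[np.arange(int(keep.sum())), target[keep]] -= 1.0
            logits[visible[keep]] -= lr * grad
    return logits.argmax(axis=1)

# Toy object: points 0..5 form one part, points 6..11 another.
# Each view is (indices of visible points, 2D mask id of each visible point).
views = [
    (np.arange(0, 9), np.array([0] * 6 + [1] * 3)),   # view 0 sees points 0..8
    (np.arange(3, 12), np.array([1] * 3 + [0] * 6)),  # view 1 sees points 3..11, ids swapped
]
labels = lift_masks_to_3d(views, n_points=12, n_instances=2)
print(labels)
```

In this toy setup the 2D mask ids are inconsistent across views, so the matching step is what lets the per-view evidence accumulate on the same 3D instance. Views are processed one at a time so that each matching sees the logits already updated by the previous view.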

Yuchen Zhou, Jiayuan Gu, Xuanlin Li, Minghua Liu, Yunhao Fang, Hao Su • 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Part Segmentation | PartNet (test) | - | - | 19 |
| Part Segmentation | PartNetE | mIoU | 62.6 | 16 |
| Part Segmentation | PartNet-E few-shot | mIoU (Bottle) | 85.5 | 11 |
| Promptable 3D semantic segmentation | 3Dcompat Coarse | mIoU (Canonical, Part/Object) | 6.12 | 5 |
| Promptable 3D semantic segmentation | 3Dcompat Fine | mIoU (Canonical, Part of Obj) | 3.79 | 5 |
| Promptable 3D semantic segmentation | ShapeNet part | mIoU (Canonical, Part of Obj.) | 1.43 | 5 |
| Promptable 3D semantic segmentation | PartNet-E | mIoU (Canonical, Part of Obj) | 5.12 | 5 |