PartSLIP++: Enhancing Low-Shot 3D Part Segmentation via Multi-View Instance Segmentation and Maximum Likelihood Estimation

About

Open-world 3D part segmentation is pivotal in diverse applications such as robotics and AR/VR. Traditional supervised methods often grapple with limited 3D data availability and struggle to generalize to unseen object categories. PartSLIP, a recent advancement, has made significant strides in zero- and few-shot 3D part segmentation by harnessing the 2D open-vocabulary detection model GLIP and introducing a heuristic method for lifting multi-view 2D bounding box predictions into 3D segmentation masks. In this paper, we introduce PartSLIP++, an enhanced version designed to overcome the limitations of its predecessor. Our approach incorporates two major improvements. First, we use a pre-trained 2D segmentation model, SAM, to produce pixel-wise 2D segmentations, yielding more precise annotations than the 2D bounding boxes used in PartSLIP. Second, PartSLIP++ replaces the heuristic 3D conversion process with a modified Expectation-Maximization algorithm. This algorithm treats the 3D instance segmentation as unobserved latent variables and iteratively refines it by alternating between 2D-3D matching and optimization with gradient descent. Through extensive evaluations, we show that PartSLIP++ outperforms PartSLIP in both low-shot 3D semantic and instance-based object part segmentation tasks. Code is released at https://github.com/zyc00/PartSLIP2.

Yuchen Zhou, Jiayuan Gu, Xuanlin Li, Minghua Liu, Yunhao Fang, Hao Su • 2023
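
The alternation between 2D-3D matching and gradient-based refinement described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the function name `lift_masks_em`, the input layout (one row of per-point 2D mask ids per view, with `-1` for points that are not visible), the initialization from the first view, and the greedy argmax matching are all assumptions made for illustration. It only shows the general shape of a hard-EM style loop in which per-point 3D instance assignments are the latent variables.

```python
import numpy as np


def softmax(logits):
    """Row-wise softmax with the usual max-subtraction for stability."""
    z = logits - logits.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)


def lift_masks_em(view_labels, num_instances, num_iters=20, lr=0.5):
    """Toy hard-EM style lifting of per-view 2D mask ids to 3D instance ids.

    view_labels: (num_views, num_points) ints; -1 means "not visible".
    Mask ids are arbitrary per view, so they must be matched to 3D instances.
    Assumption: mask ids in the first view are < num_instances.
    """
    view_labels = np.asarray(view_labels)
    num_points = view_labels.shape[1]

    # Latent variables: soft per-point assignment to 3D instances (as logits).
    # Hypothetical initialization: seed from the first view to break symmetry.
    logits = np.zeros((num_points, num_instances))
    seen = np.flatnonzero(view_labels[0] >= 0)
    logits[seen, view_labels[0, seen]] = 2.0

    for _ in range(num_iters):
        probs = softmax(logits)
        grad = np.zeros_like(logits)
        for labels_2d in view_labels:
            for mask_id in np.unique(labels_2d[labels_2d >= 0]):
                pts = np.flatnonzero(labels_2d == mask_id)
                # Matching step: pick the 3D instance that best explains
                # this 2D mask under the current soft assignment.
                k = int(np.argmax(probs[pts].mean(axis=0)))
                # Optimization step: accumulate the cross-entropy gradient
                # (softmax probabilities minus one-hot target) for these points.
                grad[pts] += probs[pts]
                grad[pts, k] -= 1.0
        logits -= lr * grad  # plain gradient descent on the logits

    return softmax(logits).argmax(axis=1)


# Six points, two views-consistent parts; each view sees a subset of points
# and uses its own arbitrary mask numbering.
demo_views = [
    [0, 0, -1, 1, 1, -1],
    [-1, 1, 1, -1, 0, 0],
    [1, 1, 1, 0, 0, 0],
]
demo_result = lift_masks_em(demo_views, num_instances=2)
print(demo_result)
```

In this toy setting the points that the first view never sees are labeled purely through the masks of the other views, which is the role the matching step plays. A one-to-one assignment (for example, Hungarian matching) could replace the greedy argmax if two masks in the same view must not map to the same 3D instance.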

Related benchmarks

Task	Dataset	Result
Part Segmentation	PartNet (test)	--	19
Part Segmentation	PartNetE	mIoU 62.6	16
Part Segmentation	PartNet-E few-shot	mIoU (Bottle) 85.5	11
Promptable 3D semantic segmentation	3Dcompat Coarse	mIoU (Canonical, Part/Object) 6.12	5
Promptable 3D semantic segmentation	3Dcompat Fine	mIoU (Canonical, Part of Obj) 3.79	5
Promptable 3D semantic segmentation	ShapeNet part	mIoU (Canonical, Part of Obj.) 1.43	5
Promptable 3D semantic segmentation	PartNet-E	mIoU (Canonical, Part of Obj) 5.12	5
