
Point-In-Context: Understanding Point Cloud via In-Context Learning

About

The rise of large-scale models has catalyzed in-context learning as a powerful approach for multitasking, particularly in natural language and image processing. However, its application to 3D point cloud tasks has been largely unexplored. In this paper, we introduce Point-In-Context (PIC), a pioneering framework for 3D point cloud understanding that leverages in-context learning with a standard transformer architecture. PIC uniquely enables the execution of multiple tasks after a single, unified training phase, eliminating the need for fine-tuning. To extend masked point modeling to 3D in-context learning, we introduce a Joint Sampling module, a simple yet effective technique that emphasizes the mapping relationship between input and target. PIC treats both inputs and targets as coordinate-based, addressing the segmentation challenge by associating label points with pre-defined XYZ coordinates for each category. However, relying on such fixed label-coordinate assignments limits the model's ability to generalize to unseen domains. To address this limitation, we further propose two innovative training strategies: In-Context Labeling and In-Context Enhancing. These strategies are integrated into PIC++, which enhances dynamic in-context labeling and model training. Besides its multitask capability, PIC++ demonstrates generalization across part segmentation datasets by employing dynamic in-context labels and regular in-context pairs. Remarkably, PIC++, trained once without fine-tuning, can generalize effectively to unseen datasets and perform novel part segmentation through customized prompts. Overall, PIC is a general framework that seamlessly integrates additional tasks or datasets through a unified data format via in-context learning. Extensive experiments substantiate PIC's versatility and adaptability in handling diverse tasks and segmenting multiple datasets simultaneously.
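The coordinate-based treatment of segmentation described above can be illustrated with a minimal sketch. It assumes each part category is assigned one fixed XYZ anchor, that per-point labels are encoded as a target point cloud made of those anchors, and that predictions are decoded by nearest-anchor lookup. The anchor values and the names `LABEL_COORDS`, `encode_labels`, and `decode_labels` are hypothetical; the abstract does not specify the coordinates or the decoding rule PIC actually uses.

```python
import numpy as np

# Hypothetical label map: one fixed XYZ anchor per part category.
# These values are illustrative only; the source does not list PIC's coordinates.
LABEL_COORDS = np.array([
    [0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.0, 0.0, 1.0],
])

def encode_labels(labels, label_coords=LABEL_COORDS):
    """Turn per-point class ids of shape (N,) into a target point cloud (N, 3)."""
    return label_coords[np.asarray(labels)]

def decode_labels(pred_xyz, label_coords=LABEL_COORDS):
    """Map predicted coordinates (N, 3) to the id of the nearest category anchor."""
    diff = pred_xyz[:, None, :] - label_coords[None, :, :]  # (N, C, 3)
    dists = np.linalg.norm(diff, axis=-1)                   # (N, C)
    return dists.argmin(axis=1)

# Usage: encode labels, perturb them slightly to mimic an imperfect
# prediction, then decode back to class ids.
labels = np.array([0, 2, 1, 3, 2])
target = encode_labels(labels)
noisy = target + 0.05
recovered = decode_labels(noisy)
```

With a fixed map like this, the label-to-coordinate assignment is the same for every prompt, which is the limitation the abstract attributes to PIC and that the dynamic in-context labels of PIC++ are meant to relax.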

Mengyuan Liu, Zhongbin Fang, Xia Li, Joachim M. Buhmann, Deheng Ye, Xiangtai Li, Chen Change Loy • 2024

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Part Segmentation | ShapeNetPart (test) | -- | 312 |
| Denoising | ShapeNet In-Context | L1 CD Error 3.8 | 59 |
| Reconstruction | ShapeNet In-Context | CD L1 3.2 | 59 |
| Registration | ShapeNet In-Context | L1 CD Error (x1000) 6 | 47 |
| Part Segmentation | ShapeNet In-Context | mIoU 85.53 | 34 |
| 3D Part Segmentation | ShapeNetPart (val) | mIoU 87.82 | 33 |
| Multi-Entity Segmentation | Human3D (test) | mIoU 82.82 | 25 |
| Multi-Entity Segmentation | Human3D (val) | mIoU 85.59 | 25 |
| Multi-Entity Segmentation | BEHAVE (test) | mIoU 88.63 | 25 |
| Multi-Entity Segmentation | AKB-48 | mIoU 73.52 | 22 |
Showing 10 of 11 rows
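
The benchmark results are reported as Chamfer Distance (L1) and mIoU. As a rough guide to what these metrics measure, here is a minimal NumPy sketch under stated assumptions: `chamfer_l1` follows one common convention (the average of the two directional mean nearest-neighbour Euclidean distances), and `mean_iou` skips classes absent from both prediction and ground truth. The benchmarks listed here may use different conventions, for example a different scaling factor or a different treatment of absent parts, so the function names and details are illustrative only.

```python
import numpy as np

def chamfer_l1(a, b):
    """Symmetric Chamfer Distance between point sets a (N, 3) and b (M, 3),
    using unsquared Euclidean nearest-neighbour distances (assumed convention)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=-1)  # (N, M) pairwise distances
    return 0.5 * (d.min(axis=1).mean() + d.min(axis=0).mean())

def mean_iou(pred, gt, num_classes):
    """Mean intersection-over-union across classes for per-point labels.
    Classes absent from both pred and gt are skipped (assumed convention)."""
    ious = []
    for c in range(num_classes):
        p, g = pred == c, gt == c
        union = np.logical_or(p, g).sum()
        if union == 0:
            continue
        ious.append(np.logical_and(p, g).sum() / union)
    return float(np.mean(ious))

# Usage: two points shifted by one unit along z, and a 4-point labelling
# with one misclassified point.
a = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0]])
b = a + np.array([0.0, 0.0, 1.0])
cd = chamfer_l1(a, b)

pred = np.array([0, 0, 1, 1])
gt = np.array([0, 1, 1, 1])
miou = mean_iou(pred, gt, num_classes=2)
```

Lower is better for Chamfer Distance and higher is better for mIoU.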
