
Segment Any Point Cloud Sequences by Distilling Vision Foundation Models

About

Recent advancements in vision foundation models (VFMs) have opened up new possibilities for versatile and efficient visual perception. In this work, we introduce Seal, a novel framework that harnesses VFMs for segmenting diverse automotive point cloud sequences. Seal exhibits three appealing properties: i) Scalability: VFMs are directly distilled into point clouds, obviating the need for annotations in either 2D or 3D during pretraining. ii) Consistency: Spatial and temporal relationships are enforced at both the camera-to-LiDAR and point-to-segment regularization stages, facilitating cross-modal representation learning. iii) Generalizability: Seal enables knowledge transfer in an off-the-shelf manner to downstream tasks involving diverse point clouds, including those from real/synthetic, low/high-resolution, large/small-scale, and clean/corrupted datasets. Extensive experiments conducted on eleven different point cloud datasets showcase the effectiveness and superiority of Seal. Notably, Seal achieves a remarkable 45.0% mIoU on nuScenes after linear probing, surpassing random initialization by 36.9% mIoU and outperforming prior arts by 6.1% mIoU. Moreover, Seal demonstrates significant performance gains over existing methods across 20 different few-shot fine-tuning tasks on all eleven tested point cloud datasets.

Youquan Liu, Lingdong Kong, Jun Cen, Runnan Chen, Wenwei Zhang, Liang Pan, Kai Chen, Ziwei Liu • 2023

Related benchmarks

Task                         Dataset                              Result                      Rank
Semantic segmentation        nuScenes (val)                       mIoU (Segmentation) 0.7828  265
Semantic segmentation        SemanticKITTI (val)                  mIoU 46.63                  174
LiDAR Semantic Segmentation  SemanticKITTI (val)                  --                          87
Semantic segmentation        nuScenes 1.0 (val)                   mIoU 75.6                   81
Semantic segmentation        Waymo Open Dataset (val)             mIoU 49.34                  63
3D Semantic Segmentation     nuScenes Lidar-Seg (val)             --                          38
Semantic segmentation        semanticKITTI SynLiDAR source (val)  mIoU (Mean IoU) 49.26       33
Semantic segmentation        SemanticKITTI v1.0 (val)             mIoU 46.63                  30
Semantic segmentation        Synth4D (val)                        mIoU 64.5                   24
LiDAR Semantic Segmentation  SemanticSTF (val)                    mIoU 55.36                  16
Showing 10 of 28 rows
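
Every result listed above is a mean intersection-over-union (mIoU) score. As a reading aid, here is a minimal Python sketch of how that metric is commonly computed for per-point class labels. It is not taken from the Seal codebase: the function name `mean_iou`, its parameters, and the toy labels are hypothetical, and real benchmarks typically accumulate the counts over an entire validation set rather than a single list.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Average per-class IoU over classes that occur in pred or target.

    pred, target: equal-length sequences of integer class ids, one per point.
    ignore_index: optional target label to skip (e.g. unlabeled points).
    """
    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue
        if p == t:
            intersection[t] += 1
            union[t] += 1
        else:
            union[p] += 1  # false positive for the predicted class
            union[t] += 1  # false negative for the true class
    # Classes that never occur are left out of the average.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else float("nan")


# Toy example with three classes (made-up labels, for illustration only).
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {score:.4f} ({100 * score:.2f}%)")
```

The function returns a fraction between 0 and 1; multiplying by 100 gives the percentage form. The rows above appear to mix both conventions (0.7828 next to values such as 46.63), so the scale is worth checking before comparing entries.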
