LOSC: LiDAR Open-voc Segmentation Consolidator

About

We study the use of image-based Vision-Language Models (VLMs) for open-vocabulary segmentation of lidar scans in driving settings. Classically, image semantics can be back-projected onto 3D point clouds. Yet, resulting point labels are noisy and sparse. We consolidate these labels to enforce both spatio-temporal consistency and robustness to image-level augmentations. We then train a 3D network based on these refined labels. This simple method, called LOSC, outperforms the SOTA of zero-shot open-vocabulary semantic and panoptic segmentation on both nuScenes and SemanticKITTI, with significant margins. Code is available at https://github.com/valeoai/LOSC.

Nermin Samet, Gilles Puy, Renaud Marlet • 2025
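
The back-projection step mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the LOSC code base: the function name `backproject_labels`, the pinhole camera model, and the parameters `K` (3x3 intrinsics) and `T_cam_from_lidar` (4x4 extrinsics) are assumptions made for illustration. Each lidar point is projected into a per-pixel label map, and points that fall behind the camera or outside the image receive an ignore label, which is one reason the resulting point labels are sparse.

```python
import numpy as np


def backproject_labels(points_lidar, label_map, K, T_cam_from_lidar, ignore_label=-1):
    """Give each lidar point the label of the pixel it projects to.

    points_lidar:      (N, 3) points in the lidar frame.
    label_map:         (H, W) integer class map predicted on the image.
    K:                 (3, 3) pinhole intrinsics (assumed camera model).
    T_cam_from_lidar:  (4, 4) rigid transform from lidar to camera frame.
    Points not visible in the image keep `ignore_label`.
    """
    n = points_lidar.shape[0]
    h, w = label_map.shape

    # Move the points into the camera frame using homogeneous coordinates.
    pts_h = np.hstack([points_lidar, np.ones((n, 1))])
    pts_cam = (T_cam_from_lidar @ pts_h.T).T[:, :3]

    labels = np.full(n, ignore_label, dtype=np.int64)

    # Only points in front of the camera can be seen.
    in_front = pts_cam[:, 2] > 1e-6
    uvw = (K @ pts_cam[in_front].T).T
    u = np.floor(uvw[:, 0] / uvw[:, 2]).astype(np.int64)
    v = np.floor(uvw[:, 1] / uvw[:, 2]).astype(np.int64)

    # Keep only projections that land inside the image bounds.
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    idx = np.flatnonzero(in_front)[inside]
    labels[idx] = label_map[v[inside], u[inside]]
    return labels
```

This sketch ignores occlusion and any lidar-camera time offset, both of which contribute to the label noise that the consolidation stage described above is meant to reduce.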

Related benchmarks

Task                                              Dataset               Result                      Rank
Semantic segmentation                             nuScenes (val)        mIoU (Segmentation) 0.493   265
Semantic segmentation                             SemanticKITTI (val)   mIoU 35.2                   174
Panoptic Segmentation                             nuScenes (val)        PQ 48.4                     56
LiDAR Panoptic Segmentation                       SemanticKITTI (val)   PQ 32.4                     38
Annotation-free closed-set semantic segmentation  nuScenes (val)        mIoU 49.3                   16
Annotation-free closed-set semantic segmentation  SemanticKITTI (val)   mIoU 35.2                   6