Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds

About

Most existing self-supervised learning (SSL) approaches for 3D point clouds are dominated by generative methods based on Masked Autoencoders (MAE). However, these generative methods have been proven to struggle to capture high-level discriminative features effectively, leading to poor performance on linear probing and other downstream tasks. In contrast, contrastive methods excel in discriminative feature representation and generalization ability on image data. Despite this, contrastive learning (CL) in 3D data remains scarce. Besides, simply applying CL methods designed for 2D data to 3D fails to effectively learn 3D local details. To address these challenges, we propose a novel Dual-Branch Center-Surrounding Contrast (CSCon) framework. Specifically, we apply masking to the center and surrounding parts separately, constructing dual-branch inputs with center-biased and surrounding-biased representations to better capture rich geometric information. Meanwhile, we introduce a patch-level contrastive loss to further enhance both high-level information and local sensitivity. Under the FULL and ALL protocols, CSCon achieves performance comparable to generative methods; under the MLP-LINEAR, MLP-3, and ONLY-NEW protocols, our method attains state-of-the-art results, even surpassing cross-modal approaches. In particular, under the MLP-LINEAR protocol, our method outperforms the baseline (Point-MAE) by 7.9%, 6.7%, and 10.3% on the three variants of ScanObjectNN, respectively. The code will be made publicly available.
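The abstract does not spell out how the two branches or the loss are built, so the following is only a rough sketch under stated assumptions, not the authors' implementation. It assumes that "center" and "surrounding" refer to the points nearest to and farthest from each patch centroid, and that the patch-level loss is a standard InfoNCE over same-index patches from the two branches. The names `center_surround_split`, `patch_info_nce`, `keep_ratio`, and `temperature` are hypothetical.

```python
import numpy as np


def center_surround_split(patches, keep_ratio=0.5):
    """Build two views of each patch (assumed reading of the abstract).

    patches: (G, K, 3) array, G patches with K points each.
    Returns (center_view, surround_view), each of shape (G, M, 3),
    where M = round(K * keep_ratio). The center view keeps the M points
    nearest to the patch centroid (the surrounding part is masked);
    the surrounding view keeps the M farthest points.
    """
    G, K, _ = patches.shape
    m = max(1, int(round(K * keep_ratio)))
    centroids = patches.mean(axis=1, keepdims=True)        # (G, 1, 3)
    dist = np.linalg.norm(patches - centroids, axis=-1)    # (G, K)
    order = np.argsort(dist, axis=1)                       # nearest first
    idx_center = order[:, :m, None]                        # (G, M, 1)
    idx_surround = order[:, K - m:, None]                  # (G, M, 1)
    center_view = np.take_along_axis(patches, idx_center, axis=1)
    surround_view = np.take_along_axis(patches, idx_surround, axis=1)
    return center_view, surround_view


def patch_info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE over patches: row i of z_a is positive with row i of z_b.

    z_a, z_b: (G, D) patch embeddings from the two branches.
    All other patches in z_b act as negatives for patch i.
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature                     # (G, G)
    logits = logits - logits.max(axis=1, keepdims=True)    # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patches = rng.normal(size=(64, 32, 3))                 # toy input, not real data
    center_view, surround_view = center_surround_split(patches, keep_ratio=0.5)

    # Stand-in encoder (assumption): shared random projection + max pooling.
    W = rng.normal(size=(3, 16))
    z_center = np.tanh(center_view @ W).max(axis=1)        # (64, 16)
    z_surround = np.tanh(surround_view @ W).max(axis=1)    # (64, 16)

    print("patch-level InfoNCE:", patch_info_nce(z_center, z_surround))
```

In a real pipeline the stand-in encoder would be replaced by a learned point cloud backbone, and the loss would be minimized by gradient descent in a framework with automatic differentiation.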

Shaofeng Zhang, Xuanqi Chen, Xiangdong Zhang, Sitong Wu, Junchi Yan • 2025

Related benchmarks

Task | Dataset | Metric | Result | Rank
Semantic segmentation | S3DIS (Area 5) | mIoU | 61.1 | 799
Part Segmentation | ShapeNetPart (test) | mIoU (Inst.) | 86.2 | 312
Few-shot classification | ModelNet40 5-way 20-shot | Accuracy | 99.4 | 79
Few-shot classification | ModelNet40 10-way 20-shot | Accuracy | 96.5 | 79
Few-shot classification | ModelNet40 5-way 10-shot | Accuracy | 97.5 | 79
Few-shot classification | ModelNet40 10-way 10-shot | Accuracy | 93.6 | 79
3D Object Classification | ModelNet40 1k P | Accuracy | 94.1 | 61
3D Object Classification | ScanObjectNN PB_T50_RS (FULL Protocol) | Accuracy | 90.42 | 25
3D Object Classification | ScanObjectNN OBJ_BG (FULL Protocol) | Accuracy | 95.35 | 23
3D Object Classification | ScanObjectNN OBJ_ONLY (FULL Protocol) | Accuracy | 92.77 | 23
Showing 10 of 12 rows
