Dual-Branch Center-Surrounding Contrast: Rethinking Contrastive Learning for 3D Point Clouds

About

Self-supervised learning (SSL) for 3D point clouds is dominated by generative methods based on Masked Autoencoders (MAE). These generative methods, however, have been shown to struggle to capture high-level discriminative features, which leads to poor performance on linear probing and other downstream tasks. Contrastive methods, by comparison, excel at discriminative feature representation and generalization on image data, yet contrastive learning (CL) on 3D data remains scarce. Moreover, simply applying CL methods designed for 2D data to 3D fails to learn 3D local details effectively. To address these challenges, we propose a novel Dual-Branch Center-Surrounding Contrast (CSCon) framework. Specifically, we apply masking to the center and surrounding parts separately, constructing dual-branch inputs with center-biased and surrounding-biased representations to better capture rich geometric information. In addition, we introduce a patch-level contrastive loss to further enhance both high-level information and local sensitivity. Under the FULL and ALL protocols, CSCon achieves performance comparable to generative methods; under the MLP-LINEAR, MLP-3, and ONLY-NEW protocols, it attains state-of-the-art results, even surpassing cross-modal approaches. In particular, under the MLP-LINEAR protocol, our method outperforms the baseline (Point-MAE) by 7.9%, 6.7%, and 10.3% on the three variants of ScanObjectNN, respectively. The code will be made publicly available.
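The two ideas named in the abstract, a center/surrounding split and a patch-level contrastive loss, can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not specify how the center and surrounding parts are defined or which loss form is used. The sketch assumes the split is made inside each local patch by distance to the patch center, and it swaps in a standard InfoNCE loss in which the same patch index across the two branches forms the positive pair. The function names `center_surround_split` and `patch_info_nce` and the parameters `inner_ratio` and `temperature` are hypothetical.

```python
import numpy as np


def center_surround_split(patch, center, inner_ratio=0.5):
    """Split one patch of shape (K, 3) into inner and outer points.

    Assumption: "center" means the points nearest the patch center and
    "surrounding" means the remaining points of the same patch.
    """
    dist = np.linalg.norm(patch - center, axis=1)
    order = np.argsort(dist)  # nearest points first
    n_inner = int(round(inner_ratio * len(patch)))
    return patch[order[:n_inner]], patch[order[n_inner:]]


def patch_info_nce(z_a, z_b, temperature=0.1):
    """InfoNCE over patch embeddings of shape (P, D).

    Row i of z_a is treated as the positive for row i of z_b; all other
    rows of z_b act as negatives. This is a generic stand-in loss.
    """
    z_a = z_a / np.linalg.norm(z_a, axis=1, keepdims=True)
    z_b = z_b / np.linalg.norm(z_b, axis=1, keepdims=True)
    logits = z_a @ z_b.T / temperature
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return float(-np.mean(np.diag(log_prob)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    patch = rng.normal(size=(32, 3))
    inner, outer = center_surround_split(patch, patch.mean(axis=0))
    print(inner.shape, outer.shape)

    # Stand-ins for the per-patch embeddings an encoder would produce
    # for the two branches.
    z_center = rng.normal(size=(8, 16))
    z_surround = z_center + 0.05 * rng.normal(size=(8, 16))
    print(patch_info_nce(z_center, z_surround))
```

In this sketch, feeding one branch only the inner points and the other only the outer points would give the center-biased and surrounding-biased inputs; the encoder that turns each masked patch into an embedding is left out.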

Shaofeng Zhang, Xuanqi Chen, Xiangdong Zhang, Sitong Wu, Junchi Yan • 2025

Related benchmarks

Task	Dataset	Result
Semantic segmentation	S3DIS (Area 5)	mIoU 61.1	1006
Part Segmentation	ShapeNetPart (test)	mIoU (Inst.) 86.2	347
Few-shot classification	ModelNet40 10-way 20-shot	Accuracy 96.5	117
Few-shot classification	ModelNet40 10-way 10-shot	Accuracy 93.6	117
Few-shot classification	ModelNet40 5-way 20-shot	Accuracy 99.4	102
Few-shot classification	ModelNet40 5-way 10-shot	Accuracy 97.5	102
3D Object Classification	ModelNet40 1k P	Accuracy 94.1	61
3D Object Classification	ScanObjectNN	Accuracy 89.8	35
3D Object Classification	ScanObjectNN PB_T50_RS (FULL Protocol)	Accuracy 90.42	25
3D Object Classification	ScanObjectNN OBJ_BG (FULL Protocol)	Accuracy 95.35	23

Showing 10 of 12 rows
