Representation Learning for Point Cloud Understanding
About
With the rapid advancement of technology, 3D data acquisition and utilization have become increasingly prevalent across fields such as computer vision, robotics, and geospatial analysis. 3D data, captured by devices such as 3D scanners, LiDAR sensors, and RGB-D cameras, provides rich geometric, shape, and scale information. Combined with 2D images, it gives machines a comprehensive understanding of their environment, benefiting applications such as autonomous driving, remote sensing, and medical treatment. This dissertation focuses on three main areas: supervised representation learning for point cloud primitive segmentation, self-supervised learning methods, and transfer learning from 2D to 3D. Our approach integrates pre-trained 2D models to support 3D network training, significantly improving 3D understanding rather than merely transforming 2D data into 3D form. Extensive experiments validate the effectiveness of our methods, showcasing their potential to advance point cloud representation learning by effectively integrating 2D knowledge.
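One common way to let a pre-trained 2D model supervise a 3D network, consistent with the 2D-to-3D transfer described above, is pixel-to-point feature distillation: project each 3D point into the camera image, look up the frozen 2D feature at that pixel, and pull the 3D network's per-point feature toward it. The sketch below is illustrative only; the function names, the L2 loss, and the pinhole-projection setup are our assumptions, not the dissertation's exact method.

```python
import numpy as np

def project_points(points, K):
    """Project 3D points (N, 3) in camera coordinates to pixel coordinates
    using a pinhole intrinsics matrix K (3, 3). Returns (N, 2) (u, v) coords."""
    uvw = points @ K.T
    return uvw[:, :2] / uvw[:, 2:3]  # divide by depth to get pixel coordinates

def distillation_loss(feats_3d, feat_map_2d, points, K):
    """L2 loss pulling per-point 3D features toward pixel-aligned 2D features.

    feats_3d:    (N, C) features from the 3D network (hypothetical).
    feat_map_2d: (H, W, C) frozen feature map from a pre-trained 2D model
                 (hypothetical; e.g. upsampled CNN/ViT features).
    points:      (N, 3) point coordinates in the camera frame.
    """
    H, W, _ = feat_map_2d.shape
    uv = np.round(project_points(points, K)).astype(int)
    # Keep only points whose projection lands inside the image.
    valid = (uv[:, 0] >= 0) & (uv[:, 0] < W) & (uv[:, 1] >= 0) & (uv[:, 1] < H)
    targets = feat_map_2d[uv[valid, 1], uv[valid, 0]]  # pixel-aligned 2D targets
    return np.mean((feats_3d[valid] - targets) ** 2)
```

In practice the 2D backbone stays frozen while this loss trains the 3D encoder, so only genuinely transferable 2D knowledge shapes the 3D representation.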
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Semantic segmentation | S3DIS (Area 5) | mIoU | 73.8 | 799 |
| 3D object detection | ScanNet V2 (val) | mAP@0.25 | 76.1 | 352 |
| Semantic segmentation | S3DIS (6-fold) | mIoU | 78.3 | 315 |
| Shape classification | ModelNet40 (test) | -- | -- | 255 |
| Object classification | ScanObjectNN OBJ_BG | Accuracy | 92.5 | 215 |
| 3D object detection | ScanNet | mAP@0.25 | 76.1 | 123 |
| 3D object part segmentation | ShapeNet Part (test) | -- | -- | 114 |
| 3D object detection | SUN RGB-D | mAP@0.25 | 67.6 | 104 |
| 3D object detection | SUN RGB-D v1 (val) | mAP@0.25 | 67.6 | 81 |
| 3D object classification | ScanObjectNN PB_T50_RS | OA | 91.0 | 72 |