Point-DAE: Denoising Autoencoders for Self-supervised Point Cloud Learning

About

Masked autoencoders have demonstrated their effectiveness in self-supervised point cloud learning. Considering that masking is a kind of corruption, in this work we explore a more general denoising autoencoder for point cloud learning (Point-DAE) by investigating more types of corruption beyond masking. Specifically, we degrade the point cloud with certain corruptions as input, and learn an encoder-decoder model to reconstruct the original point cloud from its corrupted version. Three corruption families (i.e., density/masking, noise, and affine transformation) and a total of fourteen corruption types are investigated with traditional non-Transformer encoders. Besides the popular masking corruption, we identify another effective corruption family, i.e., affine transformation. The affine transformation disturbs all points globally, which is complementary to the masking corruption where some local regions are dropped. We also validate the effectiveness of affine transformation corruption with Transformer backbones, where we decompose the reconstruction of the complete point cloud into the reconstructions of detailed local patches and rough global shape, alleviating the position leakage problem in the reconstruction. Extensive experiments on object classification, few-shot learning, robustness testing, part segmentation, and 3D object detection validate the effectiveness of the proposed method. The code is available at https://github.com/YBZh/Point-DAE.
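The three corruption families named in the abstract can be illustrated with a minimal NumPy sketch. It is not taken from the Point-DAE repository: the function names, default parameters, the choice of one representative corruption per family (local region dropping, Gaussian jitter, z-axis rotation with per-axis scaling), and the use of a Chamfer distance as the reconstruction objective are all assumptions made for illustration.

```python
import numpy as np


def mask_points(points, drop_ratio=0.5, rng=None):
    """Density/masking corruption (illustrative): drop the points closest
    to a randomly chosen seed point, removing one local region."""
    rng = np.random.default_rng() if rng is None else rng
    n = points.shape[0]
    n_drop = int(n * drop_ratio)
    seed = points[rng.integers(n)]
    dist = np.linalg.norm(points - seed, axis=1)
    keep = np.argsort(dist)[n_drop:]  # keep the points farthest from the seed
    return points[keep]


def add_gaussian_noise(points, sigma=0.01, rng=None):
    """Noise corruption (illustrative): jitter every coordinate independently."""
    rng = np.random.default_rng() if rng is None else rng
    return points + rng.normal(0.0, sigma, size=points.shape)


def random_affine(points, max_angle=np.pi, scale_range=(0.8, 1.2), rng=None):
    """Affine corruption (illustrative): per-axis scaling followed by a
    rotation about the z-axis, applied to all points globally."""
    rng = np.random.default_rng() if rng is None else rng
    angle = rng.uniform(-max_angle, max_angle)
    c, s = np.cos(angle), np.sin(angle)
    rotation = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    scaling = np.diag(rng.uniform(scale_range[0], scale_range[1], size=3))
    transform = rotation @ scaling
    return points @ transform.T


def chamfer_distance(a, b):
    """Symmetric Chamfer distance between point sets a (N, 3) and b (M, 3).
    Assumed here as the reconstruction objective; works for N != M."""
    pairwise = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    return pairwise.min(axis=1).mean() + pairwise.min(axis=0).mean()


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    clean = rng.normal(size=(1024, 3))          # stand-in for a real point cloud
    corrupted = random_affine(clean, rng=rng)   # input to the encoder-decoder
    # A trained model would map `corrupted` back toward `clean`; here we only
    # measure how far the corrupted input is from the reconstruction target.
    print(chamfer_distance(corrupted, clean))
```

In a denoising setup of this kind, the corrupted cloud is the model input and the clean cloud is the target. Masking changes the number of points while the affine corruption moves every point, which is one way to see why the abstract describes the two as complementary.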

Yabin Zhang, Jiehong Lin, Ruihuang Li, Kui Jia, Lei Zhang • 2022

Related benchmarks

Task	Dataset	Result
Part Segmentation	ShapeNetPart (test)	--	347
3D Object Classification	ScanObjectNN PB_T50_RS	OA 88.7	94
3D Object Classification	ScanObjectNN OBJ_ONLY	Overall Accuracy 93.1	83
Classification	ScanObjectNN	OA 88.8	67
3D Classification	ScanObjectNN OBJ-BG	Top-1 Acc 93.9	42
object recognition	ModelNet40 5-way	Accuracy 98.3	40
object recognition	ModelNet40 10-way	Accuracy 95	30
object recognition	ScanObjectNN fully-supervised (PB)	Overall Accuracy (OA) 86.4	28
object recognition	ModelNet40 fully-supervised (test)	Overall Accuracy (OA) 94	26
object recognition	ScanObjectNN fully-supervised (BG)	Overall Accuracy (OA) 91.2	24
