PointLAMA: Latent Attention meets Mamba for Efficient Point Cloud Pretraining

About

Mamba has recently gained widespread attention as a backbone model for point cloud modeling, leveraging a state-space architecture that enables efficient global sequence modeling with linear complexity. However, its lack of local inductive bias limits its capacity to capture fine-grained geometric structures in 3D data. To address this limitation, we propose PointLAMA, a point cloud pretraining framework that combines task-aware point cloud serialization, a hybrid encoder with integrated Latent Attention and Mamba blocks, and a conditional diffusion mechanism built upon the Mamba backbone. Specifically, the task-aware point cloud serialization employs Hilbert/Trans-Hilbert space-filling curves and axis-wise sorting to structurally align point tokens for classification and segmentation tasks, respectively. Our lightweight Latent Attention block features a Point-wise Multi-head Latent Attention (PMLA) module, which is specifically designed to align with the Mamba architecture by leveraging the shared latent space characteristics of PMLA and Mamba. This enables enhanced local context modeling while preserving overall efficiency. To further enhance representation learning, we incorporate a conditional diffusion mechanism during pretraining, which denoises perturbed feature sequences without relying on explicit point-wise reconstruction. Experimental results demonstrate that PointLAMA achieves competitive performance on multiple benchmark datasets with minimal parameter count and FLOPs, validating its effectiveness for efficient point cloud pretraining.
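As a rough illustration of what axis-wise sorting can mean when turning an unordered point cloud into a token sequence, the sketch below orders points lexicographically by a chosen axis priority. It is a minimal example under stated assumptions, not the authors' implementation: the function names `axis_order` and `serialize`, the tie-breaking by the remaining axes, and the sample coordinates are all hypothetical, and the abstract does not specify which axis priority or how many orderings PointLAMA uses.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def axis_order(points: Sequence[Point], axes: Sequence[int] = (0, 1, 2)) -> List[int]:
    """Return point indices sorted lexicographically by the given axis priority.

    Hypothetical helper: axes=(0, 1, 2) sorts by x, breaking ties by y, then z.
    """
    return sorted(range(len(points)), key=lambda i: tuple(points[i][a] for a in axes))


def serialize(points: Sequence[Point], axes: Sequence[int] = (0, 1, 2)) -> List[Point]:
    """Reorder the points themselves into the serialized sequence."""
    return [points[i] for i in axis_order(points, axes)]


# Illustrative coordinates only (not taken from any dataset).
points = [(0.5, 0.1, 0.9), (0.1, 0.7, 0.3), (0.9, 0.4, 0.2), (0.1, 0.2, 0.8)]

x_first = axis_order(points, axes=(0, 1, 2))  # x has highest priority
z_first = axis_order(points, axes=(2, 1, 0))  # z has highest priority

print(x_first)  # [3, 1, 0, 2]
print(z_first)  # [2, 1, 3, 0]
```

The same point set yields different sequences depending on the axis priority, which is why the choice of serialization matters for a sequence model such as Mamba: it determines which points end up adjacent in the token stream.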

Xuanyu Lin, Xiaona Zeng, Xianwei Zheng, Xutao Li • 2025

Related benchmarks

Task	Dataset	Metric	Result
Few-shot classification	ModelNet40 10-way 10-shot	Accuracy	94	117
Few-shot classification	ModelNet40 10-way 20-shot	Accuracy	95.8	117
Few-shot classification	ModelNet40 5-way 20-shot	Accuracy	99	102
Few-shot classification	ModelNet40 5-way 10-shot	Accuracy	97.2	102
3D Object Classification	ModelNet40 1k P	Accuracy	94.5	61
3D Object Classification	ScanObjectNN PB_T50_RS (FULL Protocol)	Accuracy	89.53	25
3D Object Classification	ScanObjectNN OBJ_BG (FULL Protocol)	Accuracy	94.51	23
3D Object Classification	ScanObjectNN OBJ_ONLY (FULL Protocol)	Accuracy	92.86	23
