PointLAMA: Latent Attention meets Mamba for Efficient Point Cloud Pretraining

About

Mamba has recently gained widespread attention as a backbone model for point cloud modeling, leveraging a state-space architecture that enables efficient global sequence modeling with linear complexity. However, its lack of local inductive bias limits its capacity to capture fine-grained geometric structures in 3D data. To address this limitation, we propose PointLAMA, a point cloud pretraining framework that combines task-aware point cloud serialization, a hybrid encoder with integrated Latent Attention and Mamba blocks, and a conditional diffusion mechanism built upon the Mamba backbone. Specifically, the task-aware point cloud serialization employs Hilbert/Trans-Hilbert space-filling curves and axis-wise sorting to structurally align point tokens for classification and segmentation tasks, respectively. Our lightweight Latent Attention block features a Point-wise Multi-head Latent Attention (PMLA) module, which is specifically designed to align with the Mamba architecture by leveraging the shared latent space characteristics of PMLA and Mamba. This enables enhanced local context modeling while preserving overall efficiency. To further enhance representation learning, we incorporate a conditional diffusion mechanism during pretraining, which denoises perturbed feature sequences without relying on explicit point-wise reconstruction. Experimental results demonstrate that PointLAMA achieves competitive performance on multiple benchmark datasets with minimal parameter count and FLOPs, validating its effectiveness for efficient point cloud pretraining.
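
The serialization step mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names (`hilbert_index`, `quantize`, `serialize`), the grid resolution `bits`, and the quantization scheme are all hypothetical choices made for illustration. The Hilbert index uses Skilling's axes-to-transpose method, and "Trans-Hilbert" is assumed here to mean the same curve with the x and y axes swapped, which the abstract itself does not define.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float, float]


def hilbert_index(cell: Sequence[int], bits: int) -> int:
    """Hilbert index of an integer grid cell (Skilling's axes-to-transpose method)."""
    x = list(cell)
    n = len(x)
    top = 1 << (bits - 1)
    # Undo the per-level rotations and reflections, from the top bit down.
    q = top
    while q > 1:
        p = q - 1
        for i in range(n):
            if x[i] & q:
                x[0] ^= p  # invert the low bits of axis 0
            else:
                t = (x[0] ^ x[i]) & p  # exchange low bits of axis 0 and axis i
                x[0] ^= t
                x[i] ^= t
        q >>= 1
    # Gray-encode across axes.
    for i in range(1, n):
        x[i] ^= x[i - 1]
    t = 0
    q = top
    while q > 1:
        if x[n - 1] & q:
            t ^= q - 1
        q >>= 1
    x = [v ^ t for v in x]
    # Interleave bits (axis 0 most significant) into a single integer key.
    index = 0
    for bit in range(bits - 1, -1, -1):
        for v in x:
            index = (index << 1) | ((v >> bit) & 1)
    return index


def quantize(points: Sequence[Point], bits: int) -> List[Tuple[int, ...]]:
    """Map float coordinates onto a 2**bits grid (assumed scheme: shared extent)."""
    dims = range(len(points[0]))
    lo = [min(p[d] for p in points) for d in dims]
    extent = max(max(p[d] for p in points) - lo[d] for d in dims) or 1.0
    scale = (1 << bits) - 1
    return [tuple(int((p[d] - lo[d]) / extent * scale) for d in dims) for p in points]


def serialize(points: Sequence[Point], mode: str = "hilbert", bits: int = 4) -> List[int]:
    """Return point indices in serialized order for the chosen (hypothetical) mode."""
    if mode == "axis":
        # Axis-wise sorting: lexicographic order on (x, y, z).
        return sorted(range(len(points)), key=lambda i: tuple(points[i]))
    cells = quantize(points, bits)
    if mode == "trans_hilbert":
        # Assumption: Trans-Hilbert is the Hilbert curve with x and y swapped.
        cells = [(c[1], c[0], c[2]) for c in cells]
    return sorted(range(len(points)), key=lambda i: hilbert_index(cells[i], bits))
```

Calling `serialize(points, "hilbert")` or `serialize(points, "axis")` returns a permutation of point indices that could then be used to reorder point tokens before they are fed to a sequence model. The Hilbert ordering keeps spatially adjacent cells adjacent in the sequence, while axis-wise sorting gives a simpler scan along the coordinate axes.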

Xuanyu Lin, Xiaona Zeng, Xianwei Zheng, Xutao Li • 2025

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Few-shot classification | ModelNet40 10-way 10-shot | Accuracy 94 | 79 |
| Few-shot classification | ModelNet40 5-way 20-shot | Accuracy 99 | 79 |
| Few-shot classification | ModelNet40 10-way 20-shot | Accuracy 95.8 | 79 |
| Few-shot classification | ModelNet40 5-way 10-shot | Accuracy 97.2 | 79 |
| 3D Object Classification | ModelNet40 1k P | Accuracy 94.5 | 61 |
| 3D Object Classification | ScanObjectNN PB_T50_RS (FULL Protocol) | Accuracy 89.53 | 25 |
| 3D Object Classification | ScanObjectNN OBJ_BG (FULL Protocol) | Accuracy 94.51 | 23 |
| 3D Object Classification | ScanObjectNN OBJ_ONLY FULL Protocol | Accuracy 92.86 | 23 |