TSM-Pose: Topology-Aware Learning with Semantic Mamba for Category-Level Object Pose Estimation

About

Category-level object pose estimation is fundamental for embodied intelligence, yet achieving robust generalization to unseen instances remains challenging. Existing methods mainly rely on simple feature extraction and aggregation, which struggle to capture category-shared topological structures and to model semantic keypoints, limiting their generalization. To address these limitations, we propose TSM-Pose, a Topology-Aware Learning with Semantic Mamba framework for category-level pose estimation. Specifically, we introduce a Topology Extractor to capture the global topological representation of the point cloud, which is integrated into local geometric features to enable a robust category-level structural representation. In parallel, we propose a Mamba-based Global Semantic Aggregator that injects semantic priors into keypoints to enhance their expressiveness and leverages multiple TwinMamba blocks to model long-range dependencies for more effective global feature aggregation. Extensive experiments on three benchmark datasets (REAL275, CAMERA25, and HouseCat6D) demonstrate that TSM-Pose outperforms existing state-of-the-art methods.

Jinshuo Liu, Bingtao Ma, Junlin Su, Guanyuan Pan, Beining Wu, Cheng Yang, Jiaxuan Lu, Chenggang Yan, Shuai Wang • 2026

Related benchmarks

Task	Dataset	Result
Category-level 6D Object Pose Estimation	REAL275	mAP (10°5cm) 88.7	43
Category-level Object Pose Estimation	CAMERA25	Success@5deg_2cm 79.9	31
6D Object Pose Estimation	HouseCat6D (test)	IoU25 90.2	6
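
The degree/centimetre metrics in the table (10°5cm, 5deg_2cm) count a pose as correct when both its rotation error and its translation error fall under the stated thresholds. The following is a minimal sketch of that per-pose check, not the benchmarks' official evaluation code: the function names are hypothetical, translations are assumed to be in metres, and the handling of symmetric objects and the averaging into mAP are not reproduced here.

```python
import math

def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle (degrees) between two 3x3 rotation matrices (nested lists)."""
    # trace(R_gt^T @ R_pred) equals the sum of elementwise products.
    trace = sum(R_gt[i][j] * R_pred[i][j] for i in range(3) for j in range(3))
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, (trace - 1.0) / 2.0))
    return math.degrees(math.acos(cos_angle))

def translation_error_cm(t_pred, t_gt):
    """Euclidean distance in centimetres; inputs are assumed to be in metres."""
    return 100.0 * math.dist(t_pred, t_gt)

def within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg, max_cm):
    """True when both errors are under the thresholds (e.g. 10 degrees, 5 cm)."""
    return (rotation_error_deg(R_pred, R_gt) <= max_deg
            and translation_error_cm(t_pred, t_gt) <= max_cm)

def rot_z(deg):
    """Helper for the example: rotation about the z-axis by `deg` degrees."""
    a = math.radians(deg)
    c, s = math.cos(a), math.sin(a)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

# Example: a prediction off by 8 degrees and 3 cm from the ground truth.
R_gt, t_gt = rot_z(0.0), [0.0, 0.0, 0.0]
R_pred, t_pred = rot_z(8.0), [0.0, 0.0, 0.03]
passes_10_5 = within_threshold(R_pred, t_pred, R_gt, t_gt, 10.0, 5.0)
passes_5_2 = within_threshold(R_pred, t_pred, R_gt, t_gt, 5.0, 2.0)
```

In this example the pose would pass the looser 10°5cm check but fail the stricter 5°2cm one, which is why results under tighter thresholds are generally lower for the same method.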
