
SecondPose: SE(3)-Consistent Dual-Stream Feature Fusion for Category-Level Pose Estimation

About

Category-level object pose estimation, which aims to predict the 6D pose and 3D size of objects from known categories, typically struggles with large intra-class shape variation. Existing methods that rely on mean shapes often fail to capture this variation. To address this issue, we present SecondPose, a novel approach that integrates object-specific geometric features with semantic category priors from DINOv2. Leveraging the ability of DINOv2 to provide SE(3)-consistent semantic features, we hierarchically extract two types of SE(3)-invariant geometric features to further encapsulate local-to-global object-specific information. These geometric features are then point-aligned with DINOv2 features to establish a consistent object representation under SE(3) transformations, facilitating the mapping from camera space to the pre-defined canonical space and thus further enhancing pose estimation. Extensive experiments on NOCS-REAL275 demonstrate that SecondPose improves on the state of the art by 12.4%. Moreover, on the more complex HouseCat6D dataset, which contains photometrically challenging objects, SecondPose still surpasses other competitors by a large margin.
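To make the notion of an SE(3)-invariant geometric feature concrete, here is a minimal Python sketch. It uses pairwise point distances as a stand-in feature; this choice, and the helper names `random_rigid_transform` and `pairwise_distances`, are assumptions for illustration only and are not the two feature types the abstract refers to. The sketch only demonstrates the property itself: the feature is unchanged when the point cloud undergoes a rigid transformation, while the raw coordinates are not.

```python
import numpy as np


def random_rigid_transform(rng):
    """Sample a random rotation (det = +1) and translation (hypothetical helper)."""
    # QR decomposition of a random matrix yields an orthogonal matrix.
    q, r = np.linalg.qr(rng.normal(size=(3, 3)))
    q = q * np.sign(np.diag(r))  # normalise column signs
    if np.linalg.det(q) < 0:     # flip one axis to turn a reflection into a rotation
        q[:, 0] = -q[:, 0]
    t = rng.normal(size=3)
    return q, t


def pairwise_distances(points):
    """N x N matrix of Euclidean distances between all point pairs."""
    diff = points[:, None, :] - points[None, :, :]
    return np.linalg.norm(diff, axis=-1)


rng = np.random.default_rng(0)
points = rng.normal(size=(64, 3))   # toy point cloud in "camera space"
R, t = random_rigid_transform(rng)
moved = points @ R.T + t            # same cloud after an SE(3) transform

# Distances are preserved by rigid motion; raw coordinates are not.
invariant_gap = np.abs(pairwise_distances(points) - pairwise_distances(moved)).max()
coordinate_gap = np.abs(points - moved).max()
print(f"max change in pairwise distances: {invariant_gap:.2e}")
print(f"max change in raw coordinates:    {coordinate_gap:.2e}")
```

Features with this property describe the object independently of the camera viewpoint, which is what makes them a plausible basis for mapping camera-space observations to a canonical space.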

Yamei Chen, Yan Di, Guangyao Zhai, Fabian Manhardt, Chenyangguang Zhang, Ruida Zhang, Federico Tombari, Nassir Navab, Benjamin Busam • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
6D Pose and Size Estimation | REAL275 | 5°5cm | 0.636 | 50
Category-level 6D Object Pose Estimation | NOCS REAL275 | IoU@75 | 50 | 8
Category-level 6D Object Pose Estimation | ShapeNet-C (test) | Rotation Mean Error (°) | 45.77 | 7
3D Object Pose Estimation | HouseCat6D (test) | Overall IoU 25 | 83.7 | 5
6D Object Pose Estimation | HouseCat6D 19 | IoU@75 | 24.9 | 4
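The 5°5cm entry in the benchmark list counts a pose prediction as correct when its rotation error is below 5 degrees and its translation error is below 5 cm. The Python sketch below illustrates that per-prediction check under stated assumptions: translations are in metres, object symmetries are ignored, and the function names are hypothetical. Published benchmark scores are typically aggregated over many detections and handle symmetric categories separately, so this is not the official evaluation code.

```python
import numpy as np


def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle (degrees) between two 3x3 rotation matrices."""
    cos_angle = (np.trace(R_pred @ R_gt.T) - 1.0) / 2.0
    # Clip to guard against values slightly outside [-1, 1] from rounding.
    return float(np.degrees(np.arccos(np.clip(cos_angle, -1.0, 1.0))))


def within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg=5.0, max_cm=5.0):
    """True if both rotation and translation errors fall under the thresholds.

    Assumes translations are given in metres (an assumption of this sketch).
    """
    translation_cm = float(np.linalg.norm(t_pred - t_gt)) * 100.0
    return rotation_error_deg(R_pred, R_gt) < max_deg and translation_cm < max_cm


def rot_z(deg):
    """Rotation about the z-axis by `deg` degrees (toy helper)."""
    a = np.radians(deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a),  np.cos(a), 0.0],
                     [0.0,        0.0,       1.0]])


R_gt, t_gt = np.eye(3), np.zeros(3)
good = within_threshold(rot_z(3.0), np.array([0.02, 0.0, 0.0]), R_gt, t_gt)
bad = within_threshold(rot_z(10.0), np.array([0.02, 0.0, 0.0]), R_gt, t_gt)
print(good, bad)
```

In the toy example, the first prediction is off by 3 degrees and 2 cm and passes the check, while the second is off by 10 degrees and fails on rotation alone.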
