Category-Level 6D Object Pose Estimation in the Wild: A Semi-Supervised Learning Approach and A New Dataset
About
6D object pose estimation is one of the fundamental problems in computer vision and robotics research. While considerable recent effort has gone into generalizing pose estimation to novel object instances within the same category, namely category-level 6D pose estimation, it remains restricted to constrained environments because annotated data is limited. In this paper, we collect Wild6D, a new unlabeled RGBD object video dataset with diverse instances and backgrounds. We use this data to generalize category-level 6D object pose estimation in the wild with semi-supervised learning. We propose a new model, called Rendering for Pose estimation network (RePoNet), that is jointly trained with free ground truth on synthetic data and a silhouette matching objective function on real-world data. Without using any 3D annotations on real data, our method outperforms state-of-the-art methods on the previous dataset and on our Wild6D test set (with manual annotations for evaluation) by a large margin. Project page with Wild6D data: https://oasisyang.github.io/semi-pose
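
To make the idea of a silhouette matching objective concrete, the sketch below compares a rendered object mask with an observed foreground mask using a soft IoU loss. This is an illustrative assumption, not the loss used by RePoNet: the abstract does not specify the exact form of the objective, and the function name `soft_iou_loss` and its parameters are hypothetical. In a real training setup the rendered mask would come from a differentiable renderer and the loss would be written in an autodiff framework; NumPy is used here only to show the arithmetic.

```python
import numpy as np


def soft_iou_loss(rendered_mask, observed_mask, eps=1e-6):
    """Hypothetical silhouette loss: 1 - soft IoU of two masks.

    Both masks hold values in [0, 1] and share the same shape.
    The loss is close to 0 when the silhouettes coincide and
    equals 1 when they do not overlap at all.
    """
    rendered = np.asarray(rendered_mask, dtype=np.float64)
    observed = np.asarray(observed_mask, dtype=np.float64)
    if rendered.shape != observed.shape:
        raise ValueError("masks must have the same shape")

    # Soft intersection and union, valid for fractional mask values.
    intersection = np.sum(rendered * observed)
    union = np.sum(rendered + observed - rendered * observed)
    return float(1.0 - intersection / (union + eps))


if __name__ == "__main__":
    # Two toy 4x4 masks: a 2x2 square and the same square shifted by one pixel.
    observed = np.zeros((4, 4))
    observed[1:3, 1:3] = 1.0
    rendered = np.zeros((4, 4))
    rendered[1:3, 2:4] = 1.0
    print(soft_iou_loss(rendered, observed))
```

A loss of this kind needs only a 2D foreground mask on the real images, which is consistent with the abstract's claim that no 3D annotations are used on real data.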
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 6D Pose and Size Estimation | REAL275 | 5°5cm | 0.339 | 50 |
| 3D Object Detection | REAL275 | -- | -- | 12 |
| Pose Estimation | NOCS (test) | mAP IoU 50 | 81.1 | 10 |
| Pose Estimation | NOCS REAL275 (test) | mAP (IoU=0.50) | 0.811 | 10 |
| Shape Reconstruction | NOCS | Shape Error (Bottle) | 1.51 | 5 |
| Shape Reconstruction | REAL275 (test) | Bottle Error (mm) | 1.51 | 5 |
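
The 5°5cm entry in the table counts a predicted pose as correct when its rotation error is below 5 degrees and its translation error is below 5 cm. The following is a minimal sketch of that per-prediction check, assuming rotations are 3x3 matrices and translations are given in metres. The function names are hypothetical, and the sketch ignores the special handling of symmetric categories (such as bottles or bowls) that benchmark evaluation code typically applies.

```python
import numpy as np


def rotation_error_deg(R_pred, R_gt):
    """Geodesic angle in degrees between two 3x3 rotation matrices."""
    R_rel = np.asarray(R_pred) @ np.asarray(R_gt).T
    # trace(R) = 1 + 2 cos(theta); clip guards against rounding error.
    cos_theta = np.clip((np.trace(R_rel) - 1.0) / 2.0, -1.0, 1.0)
    return float(np.degrees(np.arccos(cos_theta)))


def translation_error_cm(t_pred, t_gt):
    """Euclidean distance in centimetres, assuming inputs are in metres."""
    diff = np.asarray(t_pred, dtype=np.float64) - np.asarray(t_gt, dtype=np.float64)
    return float(np.linalg.norm(diff) * 100.0)


def within_threshold(R_pred, t_pred, R_gt, t_gt, max_deg=5.0, max_cm=5.0):
    """True when both errors fall under the given thresholds."""
    return (rotation_error_deg(R_pred, R_gt) < max_deg
            and translation_error_cm(t_pred, t_gt) < max_cm)


def rot_z(deg):
    """Helper for the example: rotation about the z axis by `deg` degrees."""
    a = np.radians(deg)
    return np.array([[np.cos(a), -np.sin(a), 0.0],
                     [np.sin(a), np.cos(a), 0.0],
                     [0.0, 0.0, 1.0]])


if __name__ == "__main__":
    R_gt, t_gt = np.eye(3), np.zeros(3)
    # 3 degrees and 2 cm off: inside the 5°5cm threshold.
    print(within_threshold(rot_z(3.0), [0.02, 0.0, 0.0], R_gt, t_gt))
    # 10 degrees off: outside the threshold regardless of translation.
    print(within_threshold(rot_z(10.0), [0.0, 0.0, 0.0], R_gt, t_gt))
```

A reported score such as 0.339 would then be an aggregate of this check over many predictions; the exact aggregation (for example, mean average precision across categories) depends on the benchmark protocol and is not reproduced here.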