OSOP: A Multi-Stage One Shot Object Pose Estimation Framework

About

We present a novel one-shot method for object detection and 6 DoF pose estimation that does not require training on target objects. At test time, it takes as input a target image and a textured 3D query model. The core idea is to represent the 3D model with a number of 2D templates rendered from different viewpoints, which enables direct CNN-based dense feature extraction and matching. The object is first localized in 2D, then its approximate viewpoint is estimated, followed by dense 2D-3D correspondence prediction. The final pose is computed with PnP. We evaluate the method on the LineMOD, Occlusion, Homebrewed, YCB-V and TLESS datasets and report competitive performance compared with state-of-the-art methods trained on synthetic data, even though our method is not trained on the object models used for testing.
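The template idea in the abstract can be illustrated with a minimal sketch: sample viewpoints around the object, attach one descriptor to each rendered template, and pick the template most similar to the query. Everything here is an assumption made for illustration, not the authors' implementation: the Fibonacci-lattice sampling, the count of 64 views, the function names, and above all the toy descriptors, which are simply the viewing directions and stand in for the learned CNN features the method actually matches.

```python
import math

def sample_viewpoints(n):
    """Return n roughly uniform unit vectors on a sphere (Fibonacci lattice).

    Hypothetical helper: each vector is a camera direction from which a
    2D template of the 3D model would be rendered.
    """
    golden_angle = math.pi * (3.0 - math.sqrt(5.0))
    views = []
    for i in range(n):
        z = 1.0 - 2.0 * (i + 0.5) / n          # height in (-1, 1)
        r = math.sqrt(max(0.0, 1.0 - z * z))   # circle radius at that height
        theta = golden_angle * i
        views.append((r * math.cos(theta), r * math.sin(theta), z))
    return views

def cosine_similarity(a, b):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def best_template(query_descriptor, template_descriptors):
    """Index of the template whose descriptor is most similar to the query."""
    scores = [cosine_similarity(query_descriptor, t) for t in template_descriptors]
    return max(range(len(scores)), key=scores.__getitem__)

# Assumed setup: 64 templates, one per sampled viewpoint.
viewpoints = sample_viewpoints(64)
# Toy descriptors: the viewing direction itself (placeholder for CNN features).
templates = [list(v) for v in viewpoints]
query = [0.1, 0.2, 0.97]
idx = best_template(query, templates)
approx_view = viewpoints[idx]   # coarse viewpoint estimate for the query
```

In this sketch the selected template only gives a coarse viewpoint; in the pipeline described above, that estimate is followed by dense 2D-3D correspondence prediction and PnP to obtain the final pose.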

Ivan Shugurov, Fu Li, Benjamin Busam, Slobodan Ilic • 2022

Related benchmarks

Task	Dataset	Result
6D Object Pose Estimation	BOP 7 core datasets: LM-O, T-LESS, TUD-L, IC-BIN, ITODD, HB, YCB-V 82 (test)	AR (LM-O) 31.2	47
Pose Estimation	BOP benchmark 2019 (test)	LM-O AR 48.2	43
6D Pose Estimation	BOP challenge	LM-O 48.2	39
6-DoF Pose Estimation	YCB-V BOP challenge 2020	AR 57.2	37
6D Pose Estimation	Homebrewed BOP challenge (test)	Avg Recall 60.5	20
6D Pose Estimation	Occlusion dataset BOP challenge (test)	AR 48.2	19
6-DoF Pose Estimation	Linemod RGB synthetic 11 (train)	ADD 39.3	8
6-DoF Pose Estimation	Linemod RGBD synthetic 11 (train)	ADD 73.3	7
2D Object Detection	LM BOP 14 (test)	Precision 47	3
2D Object Detection	LMO BOP 14 (test)	Precision 31	3

