FoundationPose: Unified 6D Pose Estimation and Tracking of Novel Objects

About

We present FoundationPose, a unified foundation model for 6D object pose estimation and tracking, supporting both model-based and model-free setups. Our approach can be instantly applied at test time to a novel object without fine-tuning, as long as its CAD model is given or a small number of reference images are captured. We bridge the gap between these two setups with a neural implicit representation that allows for effective novel view synthesis, keeping the downstream pose estimation modules invariant under the same unified framework. Strong generalizability is achieved via large-scale synthetic training, aided by a large language model (LLM), a novel transformer-based architecture, and a contrastive learning formulation. Extensive evaluation on multiple public datasets involving challenging scenarios and objects indicates that our unified approach outperforms existing methods specialized for each task by a large margin. In addition, it achieves results comparable to instance-level methods despite the reduced assumptions. Project page: https://nvlabs.github.io/FoundationPose/

Bowen Wen, Wei Yang, Jan Kautz, Stan Birchfield • 2023

Related benchmarks

Task | Dataset | Result | Rank
6DoF Pose Estimation | YCB-Video (test) | -- | 72
6D Object Pose Estimation | LineMOD | Average Accuracy: 99.9 | 50
6D Object Pose Estimation | BOP 7 core datasets: LM-O, T-LESS, TUD-L, IC-BIN, ITODD, HB, YCB-V 82 (test) | AR (LM-O): 75.6 | 47
Pose Estimation | BOP benchmark 2019 (test) | LM-O AR: 78.8 | 43
6D Pose Tracking | YCB-Video (All Frames) | AUC (ADD): 96 | 14
6D Pose Estimation | occluded YCB-Video (test) | ADD-S: 97.4 | 8
6D Object Pose Tracking | YCBInEOAT (test) | -- | 7
6D Object Pose Estimation | General Inference Efficiency Benchmark (test) | Inference Time (s): 2.7 | 6
Object Pose Refinement | LM-O (test) | MSPD: 86 | 5
6DoF Pose Estimation | Novel 3D bin dataset 1.0 (test) | eTE (cm): 5.603 | 4
Showing 10 of 11 rows
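
Several rows above report ADD and ADD-S scores. As a rough illustration of what these two pose errors measure, here is a minimal pure-Python sketch using their standard definitions: ADD averages the distance between corresponding model points under the ground-truth and predicted poses, while ADD-S averages the distance from each ground-truth point to its nearest predicted point, which makes it tolerant of object symmetries. The function names, the toy point set, and the brute-force nearest-neighbor search are illustrative assumptions; this is not the evaluation code behind the numbers in the table.

```python
import math

def transform(points, rotation, translation):
    """Apply a rigid transform (3x3 rotation, 3-vector translation) to 3D points."""
    return [
        tuple(
            sum(rotation[i][j] * p[j] for j in range(3)) + translation[i]
            for i in range(3)
        )
        for p in points
    ]

def add_error(points, pose_gt, pose_pred):
    """ADD: mean distance between corresponding points under the two poses."""
    gt = transform(points, *pose_gt)
    pred = transform(points, *pose_pred)
    return sum(math.dist(a, b) for a, b in zip(gt, pred)) / len(points)

def add_s_error(points, pose_gt, pose_pred):
    """ADD-S: mean distance from each ground-truth point to the nearest
    predicted point (brute force, O(n^2); fine for a toy example)."""
    gt = transform(points, *pose_gt)
    pred = transform(points, *pose_pred)
    return sum(min(math.dist(a, b) for b in pred) for a in gt) / len(points)

# Hypothetical toy object: four points that are symmetric under a
# 180-degree rotation about the z axis.
model = [(1.0, 0.0, 0.0), (-1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]

identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
rot_z_180 = [[-1.0, 0.0, 0.0], [0.0, -1.0, 0.0], [0.0, 0.0, 1.0]]
zero = (0.0, 0.0, 0.0)

pose_gt = (identity, zero)
pose_flipped = (rot_z_180, zero)           # indistinguishable for this object
pose_shifted = (identity, (0.0, 0.0, 0.5)) # a genuine 0.5-unit translation error

print(add_error(model, pose_gt, pose_flipped))    # 2.0
print(add_s_error(model, pose_gt, pose_flipped))  # 0.0
print(add_error(model, pose_gt, pose_shifted))    # 0.5
print(add_s_error(model, pose_gt, pose_shifted))  # 0.5
```

The flipped pose shows why symmetric objects are usually scored with ADD-S: ADD penalizes a rotation that leaves the object visually unchanged, whereas ADD-S does not. Benchmarks typically turn these per-frame errors into an accuracy at a distance threshold or an area under the accuracy-threshold curve (AUC).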
