
SceneComplete: Open-World 3D Scene Completion in Cluttered Real World Environments for Robot Manipulation

About

Careful robot manipulation in everyday cluttered environments requires an accurate understanding of the 3D scene, both to grasp and place objects stably and reliably and to avoid colliding with other objects. In general, we must construct such a 3D interpretation of a complex scene from limited input, such as a single RGB-D image. We describe SceneComplete, a system for constructing a complete, segmented 3D model of a scene from a single view. SceneComplete is a novel pipeline that composes general-purpose pretrained perception modules (vision-language, segmentation, image inpainting, image-to-3D, visual descriptors, and pose estimation) to obtain highly accurate results. We demonstrate its accuracy and effectiveness with respect to ground-truth models in a large benchmark dataset and show that its accurate whole-object reconstruction enables robust grasp proposal generation, including for a dexterous hand. We release the code and additional results on our website.

Aditya Agarwal, Gaurav Singh, Bipasha Sen, Tomás Lozano-Pérez, Leslie Pack Kaelbling • 2024

Related benchmarks

Task                                                  Dataset  Result         Rank
6D Object Pose Estimation and Surface Reconstruction  HB       CD_norm 0.234  6
6D Object Pose Estimation and Surface Reconstruction  ReOcS    CD_norm 0.764  6
6D Object Pose Estimation and Surface Reconstruction  LMO      CD_norm 0.186  6
6D Part Pose Estimation and Surface Reconstruction    ArtVIP   CD_norm 0.189  6
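The page does not define the CD_norm metric reported in the benchmark table. As a rough illustration only, the sketch below assumes it denotes a symmetric Chamfer distance between predicted and ground-truth surface points, divided by the diagonal of the ground-truth bounding box. Both the normalization and all function names here are assumptions made for this example, not the benchmark's confirmed definition.

```python
import math
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def _mean_nearest_distance(src: Sequence[Point], dst: Sequence[Point]) -> float:
    """Average distance from each point in src to its nearest neighbor in dst."""
    return sum(min(math.dist(p, q) for q in dst) for p in src) / len(src)


def chamfer_distance(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Symmetric Chamfer distance (brute force, O(len(a) * len(b)))."""
    if not a or not b:
        raise ValueError("both point sets must be non-empty")
    return 0.5 * (_mean_nearest_distance(a, b) + _mean_nearest_distance(b, a))


def bbox_diagonal(points: Sequence[Point]) -> float:
    """Length of the axis-aligned bounding-box diagonal of a point set."""
    mins = tuple(min(p[i] for p in points) for i in range(3))
    maxs = tuple(max(p[i] for p in points) for i in range(3))
    return math.dist(mins, maxs)


def normalized_chamfer_distance(pred: Sequence[Point], gt: Sequence[Point]) -> float:
    """Chamfer distance divided by the ground-truth bounding-box diagonal.

    ASSUMPTION: this normalization is illustrative; the benchmark's actual
    CD_norm definition is not given on this page.
    """
    scale = bbox_diagonal(gt)
    if scale == 0.0:
        raise ValueError("ground-truth point set has zero extent")
    return chamfer_distance(pred, gt) / scale


if __name__ == "__main__":
    gt = [(0.0, 0.0, 0.0), (3.0, 4.0, 0.0)]
    pred = [(0.0, 0.0, 1.0), (3.0, 4.0, 1.0)]  # ground truth shifted by 1 along z
    print(normalized_chamfer_distance(pred, gt))
```

Under this assumed definition, lower values indicate a closer match between the reconstructed and ground-truth surfaces, and the normalization makes scores comparable across objects of different sizes.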
