
Deep Sliding Shapes for Amodal 3D Object Detection in RGB-D Images

About

We focus on the task of amodal 3D object detection in RGB-D images, which aims to produce a 3D bounding box of an object in metric form at its full extent. We introduce Deep Sliding Shapes, a 3D ConvNet formulation that takes a 3D volumetric scene from an RGB-D image as input and outputs 3D object bounding boxes. In our approach, we propose the first 3D Region Proposal Network (RPN) to learn objectness from geometric shapes and the first joint Object Recognition Network (ORN) to extract geometric features in 3D and color features in 2D. In particular, we handle objects of various sizes by training an amodal RPN at two different scales and an ORN to regress 3D bounding boxes. Experiments show that our algorithm outperforms the state of the art by 13.8 in mAP and is 200x faster than the original Sliding Shapes. All source code and pre-trained models will be available on GitHub.
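
To make the phrase "3D volumetric scene from an RGB-D image" concrete, here is a minimal sketch of how a depth map can be back-projected into metric 3D points and binned into voxels. It assumes a pinhole camera, and it uses plain binary occupancy as a simplification; the paper's actual volumetric encoding is not described in this abstract and may differ. The function name `depth_to_voxels` and the parameters `fx`, `fy`, `cx`, `cy`, and `voxel_size` are hypothetical names chosen for illustration.

```python
import math
from typing import List, Set, Tuple

Voxel = Tuple[int, int, int]


def depth_to_voxels(
    depth: List[List[float]],
    fx: float,
    fy: float,
    cx: float,
    cy: float,
    voxel_size: float,
) -> Set[Voxel]:
    """Back-project a depth map (meters) and return occupied voxel indices.

    Assumption: pinhole intrinsics (fx, fy, cx, cy) and binary occupancy.
    This is an illustrative simplification, not the paper's encoding.
    """
    occupied: Set[Voxel] = set()
    for v, row in enumerate(depth):
        for u, z in enumerate(row):
            if z <= 0:
                continue  # treat non-positive depth as a missing reading
            # Pinhole back-projection from pixel (u, v) to camera coordinates.
            x = (u - cx) * z / fx
            y = (v - cy) * z / fy
            occupied.add(
                (
                    math.floor(x / voxel_size),
                    math.floor(y / voxel_size),
                    math.floor(z / voxel_size),
                )
            )
    return occupied


# A tiny 2x2 depth map with one missing pixel.
voxels = depth_to_voxels(
    [[1.0, 0.0], [1.0, 2.0]], fx=1.0, fy=1.0, cx=0.0, cy=0.0, voxel_size=0.5
)
```

Because the grid is in metric units, a detector operating on it can predict box sizes and positions in meters directly, which is what "in metric form" refers to.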

Shuran Song, Jianxiong Xiao • 2015

Related benchmarks

Task                 Dataset               Metric         Result  Rank
3D Object Detection  ScanNet V2 (val)      mAP@0.25       15.2    352
3D Object Detection  SUN RGB-D (val)       mAP@0.25       42.1    158
3D Object Detection  SUN RGB-D             mAP@0.25       42.1    104
3D Object Detection  SUN RGB-D v1 (val)    mAP@0.25       42.1    81
3D Object Detection  ScanNet V2            AP50           6.8     54
Object Detection     NYUD v2 (test)        Mean AP (b)    72.3    24
3D Object Detection  ScanNet v2 (test)     mAP@0.5        6.8     23
3D Object Detection  SUN RGB-D v1 (test)   Bed AP         78.8    18
3D Object Detection  SUN-RGBD (val)        Bathtub AP     44.2    17
3D Object Detection  SUN-RGBD (test)       AP (bathtub)   44.2    7
Showing 10 of 14 rows
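
In the mAP@0.25 and mAP@0.5 entries above, the number after the "@" is conventionally the 3D intersection-over-union (IoU) threshold a predicted box must reach against a ground-truth box to count as a correct detection. The sketch below illustrates that matching test under a simplifying assumption: boxes are axis-aligned and given as (xmin, ymin, zmin, xmax, ymax, zmax) in meters, whereas the benchmarks' official evaluation code may use oriented boxes. The names `Box`, `volume`, and `iou_3d` are hypothetical.

```python
from typing import Tuple

# Assumed layout: (xmin, ymin, zmin, xmax, ymax, zmax), in meters.
Box = Tuple[float, float, float, float, float, float]


def volume(b: Box) -> float:
    """Volume of an axis-aligned box; degenerate boxes have zero volume."""
    return (
        max(0.0, b[3] - b[0]) * max(0.0, b[4] - b[1]) * max(0.0, b[5] - b[2])
    )


def iou_3d(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned 3D boxes."""
    inter = 1.0
    for i in range(3):
        lo = max(a[i], b[i])
        hi = min(a[i + 3], b[i + 3])
        if hi <= lo:
            return 0.0  # no overlap along this axis
        inter *= hi - lo
    union = volume(a) + volume(b) - inter
    return inter / union


pred: Box = (0.0, 0.0, 0.0, 2.0, 2.0, 2.0)
gt: Box = (1.0, 0.0, 0.0, 3.0, 2.0, 2.0)
score = iou_3d(pred, gt)

matches_at_025 = score >= 0.25  # correct under an IoU threshold of 0.25
matches_at_05 = score >= 0.5    # rejected under the stricter 0.5 threshold
```

The same prediction can pass at 0.25 and fail at 0.5, which is consistent with the lower scores reported in the table under the stricter threshold.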
