Mono3D++: Monocular 3D Vehicle Detection with Two-Scale 3D Hypotheses and Task Priors

About

We present a method to infer 3D pose and shape of vehicles from a single image. To tackle this ill-posed problem, we optimize two-scale projection consistency between the generated 3D hypotheses and their 2D pseudo-measurements. Specifically, we use a morphable wireframe model to generate a fine-scaled representation of vehicle shape and pose. To reduce its sensitivity to 2D landmarks, we jointly model the 3D bounding box as a coarse representation, which improves robustness. We also integrate three task priors (unsupervised monocular depth, a ground plane constraint, and vehicle shape priors) with forward projection errors into an overall energy function.
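The two-scale idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pinhole camera model, the squared-error losses, the weighted-sum form of the energy, and every function and parameter name below are assumptions made for illustration, since the abstract does not specify them. The fine-scale term compares projected wireframe vertices with 2D landmarks, the coarse-scale term compares the projected 3D box with a 2D box, and the prior terms are passed in as precomputed scalars.

```python
from typing import Sequence, Tuple

Point3D = Tuple[float, float, float]
Point2D = Tuple[float, float]
Intrinsics = Tuple[float, float, float, float]  # (fx, fy, cx, cy), assumed pinhole camera


def project(point: Point3D, fx: float, fy: float, cx: float, cy: float) -> Point2D:
    """Pinhole projection of a camera-frame 3D point (z > 0) to pixel coordinates."""
    x, y, z = point
    return (fx * x / z + cx, fy * y / z + cy)


def landmark_error(vertices_3d: Sequence[Point3D],
                   landmarks_2d: Sequence[Point2D],
                   intrinsics: Intrinsics) -> float:
    """Fine scale (assumed loss): mean squared pixel distance between
    projected wireframe vertices and their 2D landmark pseudo-measurements."""
    total = 0.0
    for vertex, (u, v) in zip(vertices_3d, landmarks_2d):
        pu, pv = project(vertex, *intrinsics)
        total += (pu - u) ** 2 + (pv - v) ** 2
    return total / len(vertices_3d)


def box_error(corners_3d: Sequence[Point3D],
              box_2d: Tuple[float, float, float, float],
              intrinsics: Intrinsics) -> float:
    """Coarse scale (assumed loss): squared difference between the tight 2D box
    around the projected 3D box corners and a 2D box (u_min, v_min, u_max, v_max)."""
    projected = [project(c, *intrinsics) for c in corners_3d]
    us = [p[0] for p in projected]
    vs = [p[1] for p in projected]
    tight = (min(us), min(vs), max(us), max(vs))
    return sum((a - b) ** 2 for a, b in zip(tight, box_2d))


def total_energy(e_landmark: float, e_box: float,
                 e_depth: float, e_ground: float, e_shape: float,
                 weights: Sequence[float] = (1.0, 1.0, 1.0, 1.0, 1.0)) -> float:
    """Hypothetical overall energy: a weighted sum of the two projection
    terms and the three prior terms (depth, ground plane, shape)."""
    terms = (e_landmark, e_box, e_depth, e_ground, e_shape)
    return sum(w * t for w, t in zip(weights, terms))
```

In a setup like this, a 3D hypothesis would be scored by computing each term and minimizing `total_energy` over pose and shape parameters; the relative weights are placeholders rather than values from the paper.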

Tong He, Stefano Soatto • 2019

Related benchmarks

Task	Dataset	Result	Rank
3D Object Detection	KITTI (val)	--	85
3D Object Detection	KITTI (val)	--	57
Bird's Eye View 3D Object Detection	KITTI (val1)	AP_BEV (IoU=0.5, Easy) 46.7	17
Monocular 3D Object Detection	KITTI (val)	AP_R11 (Moderate) 7.9	17
