NeurOCS: Neural NOCS Supervision for Monocular 3D Object Localization

About

Monocular 3D object localization in driving scenes is a crucial task, but challenging due to its ill-posed nature. Estimating 3D coordinates for each pixel on the object surface holds great potential as it provides dense 2D-3D geometric constraints for the underlying PnP problem. However, high-quality ground-truth supervision is not available in driving scenes due to sparsity and various artifacts of LiDAR data, as well as the practical infeasibility of collecting per-instance CAD models. In this work, we present NeurOCS, a framework that uses instance masks and 3D boxes as input to learn 3D object shapes by means of differentiable rendering; the learned shapes in turn serve as supervision for learning dense object coordinates. Our approach rests on insights in learning a category-level shape prior directly from real driving scenes, while properly handling single-view ambiguities. Furthermore, we study and make critical design choices to learn object coordinates more effectively from an object-centric view. Altogether, our framework leads to a new state of the art in monocular 3D localization, ranking 1st on the KITTI-Object benchmark among published monocular methods.
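To make the "dense 2D-3D constraints for the underlying PnP problem" concrete, here is a minimal sketch of how per-pixel object coordinates could be turned into a pose estimate. It is not the authors' implementation: the function name `solve_pnp_dlt`, the convention that normalized coordinates lie in [-0.5, 0.5]^3 and are scaled by the box dimensions, the axis order of `dims`, and all numeric values (intrinsics, pose, box size) are assumptions made up for illustration. The solver is a plain linear DLT on noise-free synthetic data.

```python
import numpy as np

def solve_pnp_dlt(points_3d, pixels, K):
    """Linear (DLT) PnP: find R, t with pixels ~ K (R X + t).

    Needs at least 6 non-coplanar points. Illustrative only.
    """
    n = points_3d.shape[0]
    # Remove the intrinsics so the unknown matrix is [R | t] up to scale.
    pix_h = np.hstack([pixels, np.ones((n, 1))])
    xy = (np.linalg.inv(K) @ pix_h.T).T[:, :2]
    X_h = np.hstack([points_3d, np.ones((n, 1))])

    # Two linear equations per 2D-3D correspondence.
    A = np.zeros((2 * n, 12))
    A[0::2, 0:4] = X_h
    A[0::2, 8:12] = -xy[:, 0:1] * X_h
    A[1::2, 4:8] = X_h
    A[1::2, 8:12] = -xy[:, 1:2] * X_h

    # The null vector of A is the stacked 3x4 matrix, up to scale and sign.
    _, _, vt = np.linalg.svd(A)
    P = vt[-1].reshape(3, 4)

    # Project the left 3x3 block onto a rotation and recover the scale.
    U, S, Vt = np.linalg.svd(P[:, :3])
    R = U @ Vt
    sign = 1.0
    if np.linalg.det(R) < 0:  # the null vector came out with flipped sign
        R, sign = -R, -1.0
    t = sign * P[:, 3] / S.mean()
    return R, t

# --- Synthetic example (all values hypothetical) ---
rng = np.random.default_rng(0)

# Assumed normalized object coordinates, one per "pixel" on the object.
nocs = rng.uniform(-0.5, 0.5, size=(200, 3))
dims = np.array([4.0, 1.5, 1.8])   # assumed box size along object X, Y, Z (m)
points_obj = nocs * dims           # metric points in the object frame

# Assumed ground-truth pose: yaw about the camera Y axis plus a translation.
yaw = 0.3
R_gt = np.array([[np.cos(yaw), 0.0, np.sin(yaw)],
                 [0.0, 1.0, 0.0],
                 [-np.sin(yaw), 0.0, np.cos(yaw)]])
t_gt = np.array([2.0, 1.0, 20.0])

# Assumed pinhole intrinsics.
K = np.array([[720.0, 0.0, 620.0],
              [0.0, 720.0, 187.0],
              [0.0, 0.0, 1.0]])

# Project the object points to get the 2D side of each correspondence.
cam = points_obj @ R_gt.T + t_gt
proj = cam @ K.T
pixels = proj[:, :2] / proj[:, 2:3]

R_est, t_est = solve_pnp_dlt(points_obj, pixels, K)
print("rotation error:", np.abs(R_est - R_gt).max())
print("translation error:", np.abs(t_est - t_gt).max())
```

With predicted (noisy) coordinates, a plain DLT would likely be too fragile; a robust solver with outlier rejection and nonlinear refinement is the more usual choice, and the sketch above only shows why many per-pixel correspondences constrain the pose.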

Zhixiang Min, Bingbing Zhuang, Samuel Schulter, Buyu Liu, Enrique Dunn, Manmohan Chandraker · 2023

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Object Detection | KITTI car (test) | AP3D (Easy) | 29.89 | 195 |
| 3D Object Detection | KITTI official (test) | 3D AP (Easy) | 29.89 | 43 |
| BEV Object Detection | KITTI official (test) | AP40 Easy | 37.5 | 22 |
| 3D Object Detection | KITTI official (val) | AP40 Easy | 31.31 | 21 |
| BEV Object Detection | KITTI official (val) | Easy AP40 | 39.26 | 13 |
