Floors are Flat: Leveraging Semantics for Real-Time Surface Normal Prediction

About

We propose four insights that significantly improve the performance of deep learning models that predict surface normals and semantic labels from a single RGB image. These insights are: (1) denoise the "ground truth" surface normals in the training set to ensure consistency with the semantic labels; (2) train concurrently on a mix of real and synthetic data, instead of pretraining on synthetic data and finetuning on real data; (3) jointly predict normals and semantics using a shared model, but only backpropagate errors on pixels that have valid training labels; (4) slim down the model and use grayscale instead of color inputs. Despite the simplicity of these steps, we demonstrate consistently improved results on several datasets, using a model that runs at 12 fps on a standard mobile phone.

Steven Hickson, Karthik Raveendran, Alireza Fathi, Kevin Murphy, Irfan Essa • 2019
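
Insight (3) in the abstract, backpropagating errors only on pixels with valid training labels, can be illustrated with a masked loss. The sketch below is a minimal NumPy illustration, not the authors' implementation: the abstract does not state which loss functions were used, so the cosine-distance loss for normals, the cross-entropy loss for semantics, and the function names `masked_normal_loss` and `masked_semantic_loss` are all assumptions made for this example.

```python
import numpy as np

EPS = 1e-8  # guards against division by zero for degenerate normals


def masked_normal_loss(pred, target, valid):
    """Mean (1 - cosine similarity) over valid pixels only (assumed loss).

    pred, target: (H, W, 3) float arrays of surface normals.
    valid: (H, W) bool array, True where a ground-truth normal exists.
    """
    pred_unit = pred / np.maximum(np.linalg.norm(pred, axis=-1, keepdims=True), EPS)
    target_unit = target / np.maximum(np.linalg.norm(target, axis=-1, keepdims=True), EPS)
    cos = np.sum(pred_unit * target_unit, axis=-1)
    n_valid = int(valid.sum())
    if n_valid == 0:
        return 0.0  # no labelled pixels, so nothing contributes
    # np.where (rather than multiplying by the mask) keeps NaNs at
    # invalid pixels from leaking into the sum.
    return float(np.sum(np.where(valid, 1.0 - cos, 0.0)) / n_valid)


def masked_semantic_loss(logits, labels, valid):
    """Mean cross-entropy over valid pixels only (assumed loss).

    logits: (H, W, C) float array; labels: (H, W) int array.
    valid: (H, W) bool array, True where a semantic label exists.
    """
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.sum(np.exp(shifted), axis=-1, keepdims=True))
    # Replace invalid labels with class 0 so indexing is safe; they are masked out below.
    safe_labels = np.where(valid, labels, 0)
    picked = np.take_along_axis(log_probs, safe_labels[..., None], axis=-1)[..., 0]
    n_valid = int(valid.sum())
    if n_valid == 0:
        return 0.0
    return float(-np.sum(np.where(valid, picked, 0.0)) / n_valid)
```

In an autodiff framework the same masking means invalid pixels receive zero gradient, so an image that has only normal labels or only semantic labels can still be used to train the shared model.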

Related benchmarks

Task	Dataset	Metric	Result	Rank
Surface Normal Estimation	NYU v2 (test)	Mean Angle Distance (MAD)	17	224
Surface Normal Estimation	NYU proposed semantically corrected normals (test)	Acc (< 11.25°)	59.5	1
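
The two metrics in the table are standard for surface normal estimation: the mean angular error between predicted and ground-truth normals, and the percentage of pixels whose angular error falls below a threshold such as 11.25°. The following is a minimal NumPy sketch of how such metrics are commonly computed. The function name `normal_metrics` and the dictionary keys are hypothetical, and the benchmark's exact evaluation protocol (for example, which pixels count as valid) is not given here.

```python
import numpy as np


def normal_metrics(pred, target, valid, threshold_deg=11.25):
    """Angular-error metrics over valid pixels.

    pred, target: (H, W, 3) float arrays of surface normals.
    valid: (H, W) bool array, True where ground truth exists.
    Returns the mean angular error in degrees and the percentage of
    valid pixels with error below threshold_deg.
    """
    p = pred[valid]    # (N, 3)
    t = target[valid]  # (N, 3)
    if p.shape[0] == 0:
        raise ValueError("no valid pixels to evaluate")
    # Normalize to unit length before taking the dot product.
    p = p / np.maximum(np.linalg.norm(p, axis=-1, keepdims=True), 1e-8)
    t = t / np.maximum(np.linalg.norm(t, axis=-1, keepdims=True), 1e-8)
    # Clip so rounding error cannot push the cosine outside arccos's domain.
    cos = np.clip(np.sum(p * t, axis=-1), -1.0, 1.0)
    err_deg = np.degrees(np.arccos(cos))
    return {
        "mean_angle_deg": float(err_deg.mean()),
        "acc_below_threshold_pct": float((err_deg < threshold_deg).mean() * 100.0),
    }
```

Lower is better for the mean angular error, while higher is better for the thresholded accuracy.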
