Learning to Predict 3D Objects with an Interpolation-based Differentiable Renderer
About
Many machine learning models operate on images, but ignore the fact that images are 2D projections formed by 3D geometry interacting with light, in a process called rendering. Enabling ML models to understand image formation might be key for generalization. However, due to an essential rasterization step involving discrete assignment operations, rendering pipelines are non-differentiable and thus largely inaccessible to gradient-based ML techniques. In this paper, we present DIB-R, a differentiable rendering framework which allows gradients to be analytically computed for all pixels in an image. Key to our approach is to view foreground rasterization as a weighted interpolation of local properties and background rasterization as a distance-based aggregation of global geometry. Our approach allows for accurate optimization over vertex positions, colors, normals, light directions and texture coordinates through a variety of lighting models. We showcase our approach in two ML applications: single-image 3D object prediction and 3D textured object generation, both trained exclusively with 2D supervision. Our project website is: https://nv-tlabs.github.io/DIB-R/
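The two rasterization views described above can be sketched in a few lines. This is a minimal, illustrative sketch (the function names and the exponential falloff parameter are assumptions, not the paper's actual API): a foreground pixel's attribute is a barycentric-weighted interpolation of the three vertex attributes of the covering face, and a background pixel receives a soft alpha that decays with its distance to the nearest face, so both remain differentiable.

```python
import numpy as np

def interpolate_foreground(vertex_attrs, bary_weights):
    """Foreground rasterization as weighted interpolation.

    vertex_attrs: (3, C) attributes (e.g. colors) at the face's vertices.
    bary_weights: (3,) barycentric coordinates of the pixel (sum to 1).
    Returns a (C,) value differentiable w.r.t. attributes and weights.
    """
    return bary_weights @ vertex_attrs

def background_alpha(dist_to_nearest_face, sigma=0.01):
    """Background rasterization as distance-based aggregation (sketch).

    A pixel outside all faces gets a soft alpha that decays with its
    distance to the nearest face, so gradients can still pull geometry
    toward the pixel. sigma (an assumed hyperparameter) sets the falloff.
    """
    return np.exp(-dist_to_nearest_face / sigma)

# Example: interpolate per-vertex colors at one covered pixel.
colors = np.array([[1.0, 0.0, 0.0],   # red vertex
                   [0.0, 1.0, 0.0],   # green vertex
                   [0.0, 0.0, 1.0]])  # blue vertex
w = np.array([0.5, 0.3, 0.2])
pixel = interpolate_foreground(colors, w)  # -> [0.5, 0.3, 0.2]
```

Because both maps are smooth in the vertex positions and attributes, gradients of any pixel-wise image loss flow back to the mesh, which is what enables the 2D-supervised training described below.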
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| 3D Object Reconstruction | ShapeNet (test) | Mean IoU | 0.612 | 80 |
| 3D Reconstruction from a single 2D image | ShapeNet (test) | Volumetric IoU (Airplane) | 57 | 11 |
| Single-image 3D Reconstruction | CUB bird dataset unseen (test) | Mask IoU (%) | 75.7 | 8 |
| 3D Reconstruction | PASCAL3D+ Car | mIoU | 80 | 7 |
| 3D Reconstruction | CUB 41 (test) | mIoU | 75.7 | 6 |
| 3D Object Reconstruction | ShapeNet Car | L1 Loss (Texture) | 0.0218 | 2 |
| 3D Reconstruction | CUB bird dataset | Texture L1 Loss | 0.043 | 2 |
| 3D Reconstruction | Original View Images | LPIPS | 0.33 | 2 |