Learning Category-Specific Mesh Reconstruction from Image Collections

About

We present a learning framework for recovering the 3D shape, camera, and texture of an object from a single image. The shape is represented as a deformable 3D mesh model of an object category where a shape is parameterized by a learned mean shape and per-instance predicted deformation. Our approach allows leveraging an annotated image collection for training, where the deformable model and the 3D prediction mechanism are learned without relying on ground-truth 3D or multi-view supervision. Our representation enables us to go beyond existing 3D prediction approaches by incorporating texture inference as prediction of an image in a canonical appearance space. Additionally, we show that semantic keypoints can be easily associated with the predicted shapes. We present qualitative and quantitative results of our approach on CUB and PASCAL3D datasets and show that we can learn to predict diverse shapes and textures across objects using only annotated image collections. The project website can be found at https://akanazawa.github.io/cmr/.
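The shape parameterization described above (a learned mean shape plus a per-instance predicted deformation) can be illustrated with a minimal sketch. The function name `instance_shape`, the toy tetrahedron, and the fixed `faces` array shared by all instances are hypothetical choices made for illustration; in the actual framework the deformation would come from a network conditioned on the input image, which is not modeled here.

```python
import numpy as np


def instance_shape(mean_shape, deformation):
    """Return per-instance vertices as mean_shape + deformation.

    Both arguments are (N, 3) arrays of vertex positions / offsets.
    """
    mean_shape = np.asarray(mean_shape, dtype=float)
    deformation = np.asarray(deformation, dtype=float)
    if mean_shape.shape != deformation.shape:
        raise ValueError("mean_shape and deformation must have the same shape")
    return mean_shape + deformation


# Toy category-level mean shape: the 4 vertices of a tetrahedron (assumption).
mean_shape = np.array(
    [[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
)
# Assumed fixed mesh connectivity, shared by every instance of the category.
faces = np.array([[0, 1, 2], [0, 1, 3], [0, 2, 3], [1, 2, 3]])

# Stand-in for a predicted per-instance deformation (hand-written here).
deformation = np.array(
    [[0.0, 0.0, 0.0], [0.1, 0.0, 0.0], [0.0, -0.2, 0.0], [0.0, 0.0, 0.3]]
)

vertices = instance_shape(mean_shape, deformation)
```

Because only the vertex positions change per instance, every predicted shape keeps the same vertex ordering in this sketch, which is one way a fixed set of mesh locations could stay comparable across instances.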

Angjoo Kanazawa, Shubham Tulsiani, Alexei A. Efros, Jitendra Malik • 2018

Related benchmarks

Task	Dataset	Result
Keypoint Transfer	CUB Bird (test)	PCK@0.1 54.6	8
Single-image 3D Reconstruction	CUB bird dataset unseen (test)	Mask IoU (%) 73.8	8
Monocular Non-Rigid 3D Reconstruction	CUB 2011 (test)	mIoU 0.703	7
3D Reconstruction	PASCAL3D+ Car	mIoU 64	7
3D Reconstruction	CUB 41 (test)	mIoU 73.8	6
Keypoint Transfer	CUB Bird excluding 50 aquatic bird classes (test)	PCK@0.1 59.1	6
3D Reconstruction	Pascal3D+ Car (test)	mIoU 64	6
3D Reconstruction	PASCAL3D+ aeroplane (test)	mIoU 46.8	5
3D shape regression	CUB (test)	PCK05 43.2	3
3D Reconstruction	CUB bird dataset	Texture L1 Loss 0.043	2
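
For readers unfamiliar with the metrics in the table, the sketch below shows how a mask IoU and a PCK score are commonly computed. It is an illustration under stated assumptions, not the evaluation protocol of any listed benchmark: the function names are hypothetical, and the PCK threshold is assumed to be `alpha` times a reference box size (for example the longer side of the object bounding box), which is a common convention that the listing itself does not confirm.

```python
import numpy as np


def mask_iou(pred_mask, gt_mask):
    """Intersection over union of two binary masks of equal shape."""
    pred = np.asarray(pred_mask, dtype=bool)
    gt = np.asarray(gt_mask, dtype=bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treated as a perfect match (assumption)
    return float(np.logical_and(pred, gt).sum() / union)


def pck(pred_kps, gt_kps, visible, box_size, alpha=0.1):
    """Fraction of visible keypoints within alpha * box_size of ground truth.

    pred_kps, gt_kps: (K, 2) pixel coordinates; visible: (K,) booleans.
    """
    pred = np.asarray(pred_kps, dtype=float)
    gt = np.asarray(gt_kps, dtype=float)
    vis = np.asarray(visible, dtype=bool)
    if vis.sum() == 0:
        return float("nan")  # no visible keypoints to score
    dist = np.linalg.norm(pred - gt, axis=1)
    correct = dist <= alpha * box_size
    return float(correct[vis].mean())


# Toy data: predicted mask covers 2 pixels, ground truth covers 1 of them.
iou = mask_iou([[1, 1], [0, 0]], [[1, 0], [0, 0]])

# Toy keypoints: errors of 5, 30 and 0 pixels against a threshold of 10 pixels.
score = pck(
    pred_kps=[[0, 0], [10, 10], [50, 50]],
    gt_kps=[[0, 5], [10, 40], [50, 50]],
    visible=[True, True, True],
    box_size=100,
    alpha=0.1,
)
```

In the toy data, the masks overlap on one of two pixels in their union, and two of the three keypoints fall within the threshold.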
