
RGBD-GAN: Unsupervised 3D Representation Learning From Natural Image Datasets via RGBD Image Synthesis

About

Understanding three-dimensional (3D) geometries from two-dimensional (2D) images without any labeled information is promising for understanding the real world without incurring annotation cost. We herein propose a novel generative model, RGBD-GAN, which achieves unsupervised 3D representation learning from 2D images. The proposed method enables camera parameter-conditional image generation and depth image generation without any 3D annotations, such as camera poses or depth. We use an explicit 3D consistency loss for two RGBD images generated from different camera parameters, in addition to the ordinary GAN objective. The loss is simple yet effective for conditioning any type of image generator, such as DCGAN or StyleGAN, on camera parameters. Through experiments, we demonstrate that the proposed method can learn 3D representations from 2D images with various generator architectures.
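The core idea of the 3D consistency loss is that an RGBD image generated under one camera pose, when warped into another pose using its own depth channel, should match the RGBD image generated directly under that second pose. A minimal NumPy sketch of such a depth-based reprojection loss is below; the function name, pinhole intrinsics `K`, and relative pose `T_ab` are assumptions for illustration, and the paper's actual loss uses differentiable sampling and handles occlusion, which this nearest-neighbour sketch omits.

```python
import numpy as np

def rgbd_consistency_loss(rgbd_a, rgbd_b, K, T_ab):
    """Hypothetical sketch of an RGBD reprojection consistency loss.

    rgbd_a, rgbd_b: (H, W, 4) arrays (RGB + depth) generated under
    cameras a and b; K: 3x3 pinhole intrinsics; T_ab: 4x4 rigid
    transform from camera a's frame to camera b's frame.
    Returns the mean L1 error between rgbd_a and rgbd_b sampled at
    the reprojected pixel locations (nearest neighbour, no occlusion
    reasoning -- a simplification of the paper's loss).
    """
    H, W, _ = rgbd_a.shape
    depth_a = rgbd_a[..., 3]
    # Pixel grid in homogeneous coordinates, shape 3 x (H*W).
    u, v = np.meshgrid(np.arange(W), np.arange(H))
    pix = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3).T
    # Unproject each pixel to a 3D point in camera a's frame.
    pts_a = np.linalg.inv(K) @ pix * depth_a.reshape(-1)
    # Transform into camera b's frame and project back to pixels.
    pts_b = T_ab[:3, :3] @ pts_a + T_ab[:3, 3:4]
    proj = K @ pts_b
    ub = np.round(proj[0] / proj[2]).astype(int)
    vb = np.round(proj[1] / proj[2]).astype(int)
    # Keep only points that land inside image b, in front of it.
    valid = (ub >= 0) & (ub < W) & (vb >= 0) & (vb < H) & (pts_b[2] > 0)
    sampled = rgbd_b[vb[valid], ub[valid]]
    target = rgbd_a.reshape(-1, 4)[valid]
    return np.abs(sampled - target).mean()
```

With an identity relative pose and identical inputs, the loss is zero by construction; during training, minimizing this term alongside the GAN objective ties the generated depth channel to how the RGB output changes with the camera condition.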

Atsuhiro Noguchi, Tatsuya Harada • 2019

Related benchmarks

Task                                      | Dataset               | Metric | Result | Rank
------------------------------------------|-----------------------|--------|--------|-----
Image Synthesis                           | FFHQ (test)           | FID    | 11.6   | 8
Image Synthesis                           | Oxford Flowers (test) | KID    | 12.04  | 7
Image Synthesis                           | CUB-200-2011 (test)   | KID    | 14.92  | 7
Image Generation                          | FFHQ                  | KID    | 6.73   | 4
Image Generation                          | CUB-200-2011          | KID    | 14.92  | 4
Image Generation                          | Oxford Flowers        | KID    | 12.04  | 4
Unsupervised Depth and Defocus Learning   | Oxford Flowers        | KID    | 12.04  | 4
Unsupervised Depth and Defocus Learning   | CUB-200-2011          | KID    | 14.92  | 4
Unsupervised Depth and Defocus Learning   | FFHQ                  | KID    | 6.73   | 4
Unsupervised Depth Estimation             | Oxford Flowers        | SIDE   | 7.01   | 2

(10 of 12 benchmark rows shown.)
