Cross-view image synthesis using geometry-guided conditional GANs

About

We address the problem of generating images across two drastically different views, namely ground (street) and aerial (overhead) views. Image synthesis by itself is a very challenging computer vision task, and it is even more so when generation is conditioned on an image in another view. Due to the difference in viewpoints, there is only a small overlapping field of view and little common content between these two views. Here, we try to preserve the pixel information between the views so that the generated image is a realistic representation of the cross-view input image. For this, we propose to use homography as a guide to map the images between the views based on the common field of view, preserving the details in the input image. We then use generative adversarial networks to inpaint the missing regions in the transformed image and add realism to it. Our exhaustive evaluation and model comparison demonstrate that utilizing geometry constraints adds fine details to the generated images and can be a better approach for cross-view image synthesis than purely pixel-based synthesis methods.

Krishna Regmi, Ali Borji • 2018
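
The geometric step described in the abstract (mapping pixels between views with a homography and leaving the uncovered regions for a GAN to inpaint) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `apply_homography` and `warp_image`, the nested-list image representation, and the nearest-neighbour sampling are all assumptions made for illustration, and the GAN inpainting stage is not shown. The sketch uses backward warping, so the matrix passed in is assumed to map output (target-view) pixel coordinates to source-view coordinates.

```python
def apply_homography(H, x, y):
    """Map the point (x, y) through a 3x3 homography H given as nested lists."""
    xh = H[0][0] * x + H[0][1] * y + H[0][2]
    yh = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    if abs(w) < 1e-12:
        return None  # the point maps to infinity
    return xh / w, yh / w  # divide out the homogeneous coordinate


def warp_image(src, H_dst_to_src, out_h, out_w):
    """Backward-warp `src` (a list of pixel rows) into an out_h x out_w image.

    Returns (warped, mask). mask[r][c] is True where a source pixel was found
    and False where no content is available; in a pipeline like the one the
    abstract describes, those False regions are what would be inpainted.
    """
    src_h, src_w = len(src), len(src[0])
    warped = [[0] * out_w for _ in range(out_h)]
    mask = [[False] * out_w for _ in range(out_h)]
    for r in range(out_h):
        for c in range(out_w):
            p = apply_homography(H_dst_to_src, c, r)
            if p is None:
                continue
            sx, sy = round(p[0]), round(p[1])  # nearest-neighbour sampling
            if 0 <= sx < src_w and 0 <= sy < src_h:
                warped[r][c] = src[sy][sx]
                mask[r][c] = True
    return warped, mask
```

For example, calling `warp_image` on a 2 x 2 image with a homography that shifts coordinates by one pixel horizontally and a 2 x 3 output size fills two of the three output columns from the source and marks the remaining column as missing in the mask. Backward warping is chosen here because it visits every output pixel exactly once, so the mask of missing regions falls out directly.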

Related benchmarks

Task	Dataset	Result
Aerial-to-Ground Image Translation	CVUSA (test)	Top-1 Accuracy 0.29	10
aerial-to-ground synthesis	SVA (test)	Inception Score (all) 2.6328	9
Cross-view Image Translation (aerial-to-ground)	Dayton (test)	Top-1 Accuracy 27.56	9
Cross-view Image Synthesis	Dayton 64 x 64	Top-1 Accuracy 16.63	8
Cross-view Image Synthesis	Dayton 256 x 256	Top-1 Accuracy 30.16	8
Aerial-to-Ground Image Synthesis	SVA	User Preference Score 0.264	7
Aerial-to-Ground Image Synthesis	CVUSA	FID 89.12	5
aerial-to-ground synthesis	CVUSA	SSIM 0.4356	5
Cross-view Image Synthesis	CVUSA	Top-1 Accuracy 20.58	5
Aerial-to-Ground Image Synthesis	Dayton 64 x 64	FID 227.2	4

Showing 10 of 17 rows
