SalGAN: Visual Saliency Prediction with Generative Adversarial Networks
About
We introduce SalGAN, a deep convolutional neural network for visual saliency prediction trained with adversarial examples. The first stage of the network is a generator whose weights are learned by back-propagating a binary cross-entropy (BCE) loss computed over downsampled versions of the saliency maps. The resulting prediction is then processed by a discriminator network trained on a binary classification task: distinguishing the saliency maps produced by the generator from the ground-truth maps. Our experiments show that adversarial training, combined with a widely used loss function such as BCE, reaches state-of-the-art performance across different metrics. Our results can be reproduced with the source code and trained models available at https://imatge-upc.github.io/saliency-salgan-2017/.
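The generator objective described above can be sketched as a weighted sum of a content term (BCE over downsampled saliency maps) and an adversarial term (the discriminator's output on the generated map pushed toward the "real" label). The NumPy sketch below is a minimal illustration, not the released implementation; the pooling factor and the weight `alpha` are assumptions chosen for the example, and `d_out_fake` stands in for the discriminator's probability that the generated map is real.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross entropy averaged over all pixels."""
    pred = np.clip(pred, eps, 1 - eps)
    return -np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred))

def downsample(saliency, factor=4):
    """Average-pool a square-divisible saliency map by `factor` (assumed setting)."""
    h, w = saliency.shape
    return saliency.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def generator_loss(pred_map, gt_map, d_out_fake, alpha=0.005):
    """Content BCE on downsampled maps plus adversarial term.

    `alpha` weights the content term; `d_out_fake` is the discriminator's
    probability that the generated map is real (hypothetical scalar input).
    """
    content = bce(downsample(pred_map), downsample(gt_map))
    adversarial = -np.log(np.clip(d_out_fake, 1e-7, None))
    return alpha * content + adversarial
```

In this sketch the adversarial term rewards the generator when the discriminator is fooled (`d_out_fake` near 1), while the content term keeps predictions anchored to the ground-truth saliency distribution.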
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Video saliency prediction | DHF1K (test) | AUC-J | 0.866 | 89 |
| Video saliency prediction | Hollywood-2 (test) | SIM | 0.393 | 83 |
| Video saliency prediction | UCF Sports (test) | SIM | 0.332 | 71 |
| Saliency Prediction | MIT300 (test) | CC | 0.73 | 56 |
| Visual Saliency Prediction | SALICON (test) | CC | 0.781 | 12 |
| Saliency Prediction | DHF1K | Model Size (MB) | 130 | 12 |
| Affordance Grounding | OPRA 28 x 28 (test) | KLD | 2.12 | 11 |
| Affordance Grounding | EPIC-Hotspots 28 x 28 (test) | KLD | 1.51 | 10 |
| Grounded affordance prediction | OPRA (seen classes) | KLD | 2.116 | 9 |
| Affordance Grounding | OPRA (test) | KLD | 2.116 | 9 |