
SalGAN: Visual Saliency Prediction with Generative Adversarial Networks

About

We introduce SalGAN, a deep convolutional neural network for visual saliency prediction trained with adversarial examples. The first stage of the network consists of a generator model whose weights are learned by back-propagation computed from a binary cross entropy (BCE) loss over downsampled versions of the saliency maps. The resulting prediction is processed by a discriminator network trained to solve a binary classification task between the saliency maps generated by the generative stage and the ground truth ones. Our experiments show how adversarial training allows reaching state-of-the-art performance across different metrics when combined with a widely-used loss function like BCE. Our results can be reproduced with the source code and trained models available at https://imatge-upc.github.io/saliency-salgan-2017/.
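The generator objective described above combines a pixel-wise content loss with an adversarial term that rewards fooling the discriminator. The following is a minimal NumPy sketch of that combination; the function names and the weighting factor `alpha` are illustrative choices, not the paper's exact implementation.

```python
import numpy as np

def bce(pred, target, eps=1e-7):
    """Binary cross entropy averaged over all elements."""
    pred = np.clip(pred, eps, 1 - eps)
    return float(-np.mean(target * np.log(pred) + (1 - target) * np.log(1 - pred)))

def salgan_generator_loss(pred_map, gt_map, d_score_on_pred, alpha=0.005):
    """Sketch of a SalGAN-style generator objective:
    a content BCE term over the predicted saliency map, plus an
    adversarial BCE term pushing the discriminator's score on the
    prediction toward 1 (the 'real' label). `alpha` weighs the
    content term; its value here is purely illustrative."""
    content = bce(pred_map, gt_map)
    adversarial = bce(np.array([d_score_on_pred]), np.array([1.0]))
    return alpha * content + adversarial
```

A prediction close to the ground truth that also convinces the discriminator (score near 1) yields a low combined loss, while a poor prediction that the discriminator rejects yields a high one.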

Junting Pan, Cristian Canton Ferrer, Kevin McGuinness, Noel E. O'Connor, Jordi Torres, Elisa Sayrol, Xavier Giro-i-Nieto • 2017

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Video saliency prediction | DHF1K (test) | AUC-J | 0.866 | 89 |
| Video saliency prediction | Hollywood-2 (test) | SIM | 0.393 | 83 |
| Video saliency prediction | UCF Sports (test) | SIM | 0.332 | 71 |
| Saliency Prediction | MIT300 (test) | CC | 0.73 | 56 |
| Visual Saliency Prediction | SALICON (test) | CC | 0.781 | 12 |
| Saliency Prediction | DHF1K | Model Size (MB) | 130 | 12 |
| Affordance Grounding | OPRA 28 x 28 (test) | KLD | 2.12 | 11 |
| Affordance Grounding | EPIC-Hotspots 28 x 28 (test) | KLD | 1.51 | 10 |
| Grounded affordance prediction | OPRA (seen classes) | KLD | 2.116 | 9 |
| Affordance Grounding | OPRA (test) | KLD | 2.116 | 9 |

Showing 10 of 17 rows
