
Context Encoders: Feature Learning by Inpainting

About

We present an unsupervised visual feature learning algorithm driven by context-based pixel prediction. By analogy with auto-encoders, we propose Context Encoders -- a convolutional neural network trained to generate the contents of an arbitrary image region conditioned on its surroundings. In order to succeed at this task, context encoders need to both understand the content of the entire image, as well as produce a plausible hypothesis for the missing part(s). When training context encoders, we have experimented with both a standard pixel-wise reconstruction loss, as well as a reconstruction plus an adversarial loss. The latter produces much sharper results because it can better handle multiple modes in the output. We found that a context encoder learns a representation that captures not just appearance but also the semantics of visual structures. We quantitatively demonstrate the effectiveness of our learned features for CNN pre-training on classification, detection, and segmentation tasks. Furthermore, context encoders can be used for semantic inpainting tasks, either stand-alone or as initialization for non-parametric methods.
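The joint objective described above (pixel-wise reconstruction plus an adversarial term) can be sketched as follows. This is a minimal illustrative NumPy version, not the paper's implementation: the function name, the `1e-8` stabilizer, and the specific loss weights are assumptions for the sketch.

```python
import numpy as np

def joint_loss(pred, target, disc_prob, lambda_rec=0.999, lambda_adv=0.001):
    """Illustrative context-encoder generator loss (hypothetical helper).

    pred, target: arrays for the predicted and ground-truth missing region.
    disc_prob: discriminator's probability that the inpainted result is real.
    """
    # Pixel-wise L2 reconstruction over the missing region.
    rec = np.mean((pred - target) ** 2)
    # Adversarial term: the generator is rewarded when the discriminator
    # assigns high "real" probability to its output.
    adv = -np.mean(np.log(disc_prob + 1e-8))
    # Weighted combination; reconstruction dominates, the adversarial
    # term sharpens the output by penalizing blurry, averaged modes.
    return lambda_rec * rec + lambda_adv * adv

# Toy usage with a perfectly reconstructed region and an unsure discriminator.
pred = np.zeros((3, 64, 64))
target = np.zeros((3, 64, 64))
loss = joint_loss(pred, target, np.array([0.5]))
print(loss)
```

The intuition matches the abstract: a pure L2 loss averages over the multiple plausible completions and produces blur, while the adversarial term pushes the output toward a single sharp mode.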

Deepak Pathak, Philipp Krähenbühl, Jeff Donahue, Trevor Darrell, Alexei A. Efros · 2016

Related benchmarks

| Task                  | Dataset                | Metric         | Result | Rank |
|-----------------------|------------------------|----------------|--------|------|
| Semantic segmentation | PASCAL VOC 2012 (val)  | Mean IoU       | 29.7   | 2040 |
| Image Classification  | ImageNet-1K 1.0 (val)  | Top-1 Accuracy | 78.4   | 1866 |
| Image Classification  | ImageNet-1k (val)      | Top-1 Accuracy | 21     | 1453 |
| Semantic segmentation | PASCAL VOC 2012 (test) | mIoU           | 29.7   | 1342 |
| Object Detection      | PASCAL VOC 2007 (test) | mAP            | 44.5   | 821  |
| Image Classification  | MNIST                  | Accuracy       | 96.39  | 395  |
| Classification        | PASCAL VOC 2007 (test) | mAP (%)        | 56.5   | 217  |
| Image Classification  | ImageNet 2012 (val)    | Top-1 Accuracy | 22.3   | 202  |
| Semantic segmentation | PASCAL VOC 2012        | mIoU           | 29.7   | 187  |
| Semantic segmentation | Pascal VOC             | mIoU           | 0.297  | 172  |
Showing 10 of 41 rows
