DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition
About
We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. Our generic tasks may differ significantly from the originally trained tasks, and there may be insufficient labeled or unlabeled data to conventionally train or adapt a deep architecture to the new tasks. We investigate and visualize the semantic clustering of deep convolutional features with respect to a variety of such tasks, including scene recognition, domain adaptation, and fine-grained recognition challenges. We compare the efficacy of relying on various network levels to define a fixed feature, and report novel results that significantly outperform the state-of-the-art on several important vision challenges. We are releasing DeCAF, an open-source implementation of these deep convolutional activation features, along with all associated network parameters, to enable vision researchers to experiment with deep representations across a range of visual concept learning paradigms.
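The idea of a fixed feature can be illustrated with a small sketch: a frozen function maps inputs to activations, and only a simple classifier on top is trained for the new task. Everything below is a toy stand-in and not the DeCAF code or its API: the random projection with ReLU plays the role of a pretrained layer, the nearest-centroid rule is one possible simple classifier chosen for brevity, and all names (`make_frozen_extractor`, `fit_centroids`, `predict`, `make_toy_data`, `run_demo`) are hypothetical.

```python
import random


def make_frozen_extractor(in_dim, feat_dim, seed=0):
    """Return a fixed feature function x -> ReLU(W x).

    Assumption: the random matrix W stands in for pretrained network
    weights. It is never updated after creation.
    """
    rng = random.Random(seed)
    weights = [[rng.gauss(0.0, 1.0) for _ in range(in_dim)]
               for _ in range(feat_dim)]

    def extract(x):
        return [max(0.0, sum(w * v for w, v in zip(row, x)))
                for row in weights]

    return extract


def fit_centroids(features, labels):
    """Train the only learned component: one mean feature vector per class."""
    sums, counts = {}, {}
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], feat)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]]
            for label in sums}


def predict(centroids, feat):
    """Assign the class whose centroid is closest in squared distance."""
    def sqdist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))
    return min(centroids, key=lambda label: sqdist(centroids[label], feat))


def make_toy_data(n_per_class, in_dim, seed):
    """Two synthetic classes: Gaussian blobs around +2 and -2 (toy data)."""
    rng = random.Random(seed)
    data = []
    for label, mean in ((0, 2.0), (1, -2.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(mean, 0.5) for _ in range(in_dim)], label))
    return data


def run_demo():
    """Extract frozen features, fit centroids, and return test accuracy."""
    extract = make_frozen_extractor(in_dim=8, feat_dim=16, seed=0)
    train = make_toy_data(20, 8, seed=1)
    test = make_toy_data(20, 8, seed=2)
    centroids = fit_centroids([extract(x) for x, _ in train],
                              [y for _, y in train])
    correct = sum(predict(centroids, extract(x)) == y for x, y in test)
    return correct / len(test)


if __name__ == "__main__":
    print("toy accuracy:", run_demo())
```

In a real setting the extractor would be the activations of a chosen layer of a trained network, and comparing layers amounts to swapping which activations `extract` returns while leaving the classifier code unchanged.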
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Fine-grained Image Classification | CUB200 2011 (test) | Accuracy | 64.96 | 536 |
| Image Classification | CUB-200-2011 (test) | -- | -- | 276 |
| Domain Adaptation | Office-31 unsupervised adaptation standard | Accuracy (A to W) | 61.6 | 162 |
| Image Classification | Office-10 + Caltech-10 | Average Accuracy | 84 | 77 |
| Object Recognition | Office (standard) | Accuracy (A to W) | 53.9 | 55 |
| Classification | Caltech101 (test) | Accuracy | 86.91 | 33 |
| Dynamic Scene Recognition | YUPENN (leave-one-out) | Accuracy | 96.7 | 12 |
| Image Classification | OFFICE DSLR → Webcam (test) | Accuracy | 91.5 | 8 |
| Image Classification | OFFICE Amazon → Webcam (test) | Accuracy | 52.2 | 8 |
| Multi-class Classification | Office standard evaluation | Accuracy (A->W) | 0.807 | 7 |