Fine-grained Classification via Categorical Memory Networks
About
Motivated by the desire to exploit patterns shared across classes, we present a simple yet effective class-specific memory module for fine-grained feature learning. The memory module stores a prototypical feature representation for each category, maintained as a moving average. We hypothesize that the combination of an input's similarities to each category is itself a useful discriminative cue. To detect these similarities, we use attention as a querying mechanism. The attention scores with respect to each class prototype are used as weights to combine the prototypes via a weighted sum, producing a response feature representation tailored to the given input. The original and response features are combined to produce an augmented feature for classification. We integrate our class-specific memory module into a standard convolutional neural network, yielding a Categorical Memory Network. Our memory module significantly improves accuracy over baseline CNNs, achieving accuracy competitive with state-of-the-art methods on four benchmarks: CUB-200-2011, Stanford Cars, FGVC Aircraft, and NABirds.
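The pipeline described above (moving-average prototypes, attention as a query, weighted sum, feature combination) can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation. The class name `CategoricalMemory`, the `momentum` value, the dot-product scoring, the softmax normalization, and the use of concatenation to combine features are all assumptions made for illustration, since the abstract does not specify them.

```python
import math


class CategoricalMemory:
    """Hypothetical sketch: one prototype vector per class, kept as a moving average."""

    def __init__(self, num_classes, dim, momentum=0.9):
        # momentum is an assumed decay rate; the abstract gives no value.
        self.momentum = momentum
        self.prototypes = [[0.0] * dim for _ in range(num_classes)]

    def update(self, feature, label):
        """Blend a new feature into the prototype of its class (moving average)."""
        m = self.momentum
        old = self.prototypes[label]
        self.prototypes[label] = [m * p + (1.0 - m) * f for p, f in zip(old, feature)]

    def attention(self, feature):
        """Score the query against every prototype and normalize the scores.

        Dot-product scoring and softmax are assumptions, not taken from the source.
        """
        scores = [sum(p * f for p, f in zip(proto, feature)) for proto in self.prototypes]
        top = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        return [e / total for e in exps]

    def respond(self, feature):
        """Weighted sum of prototypes, using the attention scores as weights."""
        weights = self.attention(feature)
        return [
            sum(w * proto[d] for w, proto in zip(weights, self.prototypes))
            for d in range(len(feature))
        ]

    def augment(self, feature):
        """Combine original and response features (concatenation is assumed)."""
        return list(feature) + self.respond(feature)


# Toy usage: two classes with 2-D features. momentum=0.0 makes each
# prototype equal to the last feature seen for its class.
memory = CategoricalMemory(num_classes=2, dim=2, momentum=0.0)
memory.update([1.0, 0.0], label=0)
memory.update([0.0, 1.0], label=1)
weights = memory.attention([2.0, 0.0])
augmented = memory.augment([2.0, 0.0])
```

In this toy setup the query `[2.0, 0.0]` is closer to the class-0 prototype, so class 0 receives the larger attention weight, and the augmented feature has twice the dimension of the input. In a real network the features would come from a CNN backbone, and the augmented feature would be passed to a classifier.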
Related benchmarks
| Task | Dataset | Metric | Result (%) | Rank |
|---|---|---|---|---|
| Fine-grained Image Classification | CUB200 2011 (test) | Accuracy | 88.2 | 536 |
| Image Classification | Stanford Cars (test) | Accuracy | 94.9 | 306 |
| Fine-grained visual classification | FGVC-Aircraft (test) | Top-1 Acc | 93.8 | 287 |
| Image Classification | CUB-200-2011 (test) | Top-1 Acc | 88.2 | 276 |
| Fine-grained Image Classification | CUB-200 2011 | Accuracy | 88.2 | 222 |
| Fine-grained Visual Categorization | Stanford Cars (test) | Accuracy | 94.9 | 110 |
| Fine-grained Visual Categorization | FGVCAircraft | Accuracy | 93.8 | 60 |
| Image Classification | NABirds | Accuracy | 87.8 | 37 |