The Devil is in the Channels: Mutual-Channel Loss for Fine-Grained Image Classification

About

The key to solving fine-grained image categorization is finding discriminative, local regions that correspond to subtle visual traits. Great strides have been made, with complex networks designed specifically to learn part-level discriminative feature representations. In this paper, we show that it is possible to cultivate subtle details without overly complicated network designs or training mechanisms; a single loss is all it takes. The main trick lies in how we delve into individual feature channels early on, as opposed to the convention of starting from a consolidated feature map. The proposed loss function, termed mutual-channel loss (MC-Loss), consists of two channel-specific components: a discriminality component and a diversity component. The discriminality component forces all feature channels belonging to the same class to be discriminative, through a novel channel-wise attention mechanism. The diversity component additionally constrains the channels so that they become mutually exclusive in the spatial dimension. The end result is a set of feature channels that each reflect a different locally discriminative region for a specific class. The MC-Loss can be trained end-to-end, without the need for any bounding-box or part annotations, and yields highly discriminative regions during inference. Experimental results show that MC-Loss, when implemented on top of common base networks, achieves state-of-the-art performance on all four fine-grained categorization datasets (CUB-Birds, FGVC-Aircraft, Flowers-102, and Stanford-Cars). Ablation studies further demonstrate the superiority of MC-Loss over other recently proposed general-purpose losses for visual classification, on two different base networks. Code is available at https://github.com/dongliangchang/Mutual-Channel-Loss
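To make the two components more concrete, here is a minimal pure-Python sketch of how they might be computed on a toy feature map. It is not the authors' implementation (see the repository linked in the abstract for that). Everything below is an assumption made for illustration: the function names, the layout (a list of channels, each a flat list of spatial activations, with consecutive channels assigned to the same class), the use of a random mask that zeroes half of each class's channels as the channel-wise attention, and the spatial softmax used to normalise each channel.

```python
import math
import random


def softmax(values):
    """Numerically stable softmax over a list of floats."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def group_channels(features, num_classes):
    """Split the channels into num_classes consecutive, equal-sized groups."""
    per_class = len(features) // num_classes
    return [features[i * per_class:(i + 1) * per_class] for i in range(num_classes)]


def diversity_score(features, num_classes):
    """Higher when the channels of a class peak at different spatial positions."""
    score = 0.0
    for group in group_channels(features, num_classes):
        # Normalise each channel over its spatial positions.
        maps = [softmax(channel) for channel in group]
        # Element-wise max across the group's channels, then sum over positions.
        score += sum(max(column) for column in zip(*maps))
    return score / num_classes


def discriminality_loss(features, label, num_classes, rng=random):
    """Cross-entropy on class scores pooled from randomly masked channel groups."""
    logits = []
    for group in group_channels(features, num_classes):
        # Assumed channel-wise attention: zero out a random half of the channels.
        dropped = set(rng.sample(range(len(group)), len(group) // 2))
        masked = [[0.0] * len(ch) if j in dropped else ch
                  for j, ch in enumerate(group)]
        # Max across channels at each position, then average over positions.
        pooled = [max(column) for column in zip(*masked)]
        logits.append(sum(pooled) / len(pooled))
    return -math.log(softmax(logits)[label])


# Toy input: two classes, two channels per class, four spatial positions.
disjoint = [[9, 0, 0, 0], [0, 9, 0, 0], [0, 0, 9, 0], [0, 0, 0, 9]]
overlapping = [[9, 0, 0, 0], [9, 0, 0, 0], [0, 0, 9, 0], [0, 0, 9, 0]]

# Channels that attend to different positions score higher on diversity.
assert diversity_score(disjoint, 2) > diversity_score(overlapping, 2)
```

In this sketch the diversity score ranges from 1 (all channels of a class peak at the same place) up to the number of channels per class (fully disjoint peaks), so a training objective would reward a larger value, while the discriminality term is minimised like an ordinary classification loss. How the two terms are weighted against each other is not stated in the abstract and is left out here.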

Dongliang Chang, Yifeng Ding, Jiyang Xie, Ayan Kumar Bhunia, Xiaoxu Li, Zhanyu Ma, Ming Wu, Jun Guo, Yi-Zhe Song • 2020

Related benchmarks

Task	Dataset	Result
Fine-grained Image Classification	CUB200 2011 (test)	Accuracy 87.3	567
Fine-grained Image Classification	Stanford Cars (test)	Accuracy 93.7	372
Fine-grained Image Classification	CUB-200 2011	Accuracy 87.3	314
Fine-grained visual classification	FGVC-Aircraft (test)	Top-1 Acc 92.9	312
Fine-grained Image Classification	Stanford Cars	Accuracy 94.4	284
Fine-grained Visual Categorization	Stanford Cars (test)	Accuracy 94.4	114
Fine grained classification	Aircraft	Top-1 Acc 92.9	72
Fine-grained visual classification	FGVC Aircraft	Top-1 Accuracy 92.6	51
Fine-grained Image Classification	FGVC Aircraft	Accuracy (All) 92.9	50
Fine-grained Image Classification	Oxford Flowers	Accuracy 97.7	49

Showing 10 of 14 rows
