
Mosaicking to Distill: Knowledge Distillation from Out-of-Domain Data

About

Knowledge distillation (KD) aims to craft a compact student model that imitates the behavior of a pre-trained teacher in a target domain. Prior KD approaches, despite their gratifying results, have largely relied on the premise that in-domain data is available to carry out the knowledge transfer. This assumption, unfortunately, is violated in many practical settings, since the original training data, or even the data domain, is often unreachable for privacy or copyright reasons. In this paper, we tackle an ambitious task, termed out-of-domain knowledge distillation (OOD-KD), which allows us to conduct KD using only OOD data that can be readily obtained at very low cost. Admittedly, OOD-KD is by nature a highly challenging task due to the agnostic domain gap. To this end, we introduce a handy yet surprisingly efficacious approach, dubbed MosaicKD. The key insight behind MosaicKD is that samples from different domains share common local patterns, even though their global semantics may vary significantly; these shared local patterns can, in turn, be re-assembled, analogous to mosaic tiling, to approximate the in-domain data and thereby alleviate the domain discrepancy. In MosaicKD, this is achieved through a four-player min-max game in which a generator, a discriminator, and a student network are collectively trained in an adversarial manner, partially under the guidance of a pre-trained teacher. We validate MosaicKD on classification and semantic segmentation tasks across various benchmarks and demonstrate that it yields results far superior to state-of-the-art counterparts on OOD data. Our code is available at https://github.com/zju-vipa/MosaicKD.
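The four-player objective described above can be illustrated with a heavily simplified sketch. The snippet below is a minimal NumPy illustration of one plausible loss decomposition, not the authors' implementation: the networks are replaced by dummy logits and discriminator scores, and the exact loss terms (local realism from the discriminator, teacher confidence, student–teacher disagreement) are assumptions read off the abstract rather than the paper's actual formulation.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    """Row-wise softmax over class logits."""
    e = np.exp(z - z.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)

def kl(p, q, eps=1e-8):
    """Mean KL divergence KL(p || q) over a batch of distributions."""
    return float(np.mean(np.sum(p * (np.log(p + eps) - np.log(q + eps)), axis=1)))

# Dummy stand-ins for network outputs on a batch of synthesized samples.
batch, classes = 8, 10
t_logits = rng.normal(size=(batch, classes))      # teacher logits on generated data
s_logits = rng.normal(size=(batch, classes))      # student logits on generated data
d_fake = rng.uniform(0.01, 0.99, size=batch)      # discriminator score for generated patches
d_real = rng.uniform(0.01, 0.99, size=batch)      # discriminator score for real OOD patches

p_t, p_s = softmax(t_logits), softmax(s_logits)

# Student: imitate the teacher on the synthesized data (minimize KL).
loss_student = kl(p_t, p_s)

# Discriminator: separate real OOD patches from generated ones (standard GAN loss).
loss_disc = -float(np.mean(np.log(d_real) + np.log(1.0 - d_fake)))

# Generator: fool the discriminator so generated samples look locally realistic,
# keep the teacher confident (low entropy), and maximize student-teacher
# disagreement -- hence the minus sign on the distillation term.
loss_gen = (
    -float(np.mean(np.log(d_fake)))
    + float(np.mean(-np.sum(p_t * np.log(p_t + 1e-8), axis=1)))
    - loss_student
)
```

In an actual training loop each of these losses would drive a separate optimizer step for its player, with the pre-trained teacher kept frozen throughout.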

Gongfan Fang, Yifan Bao, Jie Song, Xinchao Wang, Donglin Xie, Chengchao Shen, Mingli Song • 2021

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Semantic segmentation | NYU v2 (test) | mIoU: 45.4 | 248 |
| Image Classification | Stanford Dogs (test) | Top-1 Acc: 28.02 | 85 |
| Image Classification | CIFAR-100 original (test) | -- | 71 |
| Image Classification | CUB-200 (test) | Accuracy: 26.11 | 62 |
| Image Classification | ImageNet 32x32 1000 categories (test) | Test Accuracy: 26.51 | 13 |
| Image Classification | CIFAR-100 OOD (test) | Test Accuracy: 77.01 | 6 |
| Image Classification | CIFAR-100 Data-Free (test) | -- | 1 |

Other info

Code
