
Correlation Congruence for Knowledge Distillation

About

Most teacher-student frameworks based on knowledge distillation (KD) rely on a strong congruence constraint at the instance level. However, they usually ignore the correlation between multiple instances, which is also valuable for knowledge transfer. In this work, we propose a new framework named correlation congruence for knowledge distillation (CCKD), which transfers not only the instance-level information but also the correlation between instances. Furthermore, a generalized kernel method based on Taylor series expansion is proposed to better capture the correlation between instances. Empirical experiments and ablation studies on image classification tasks (including CIFAR-100 and ImageNet-1K) and metric learning tasks (including ReID and face recognition) show that the proposed CCKD substantially outperforms the original KD and achieves state-of-the-art accuracy compared with other SOTA KD-based methods. CCKD can be easily deployed in most teacher-student frameworks, such as KD and hint-based learning methods.
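To make the correlation-congruence idea concrete, the sketch below matches pairwise correlation matrices of teacher and student embeddings, using a truncated Taylor expansion of a Gaussian kernel (for L2-normalized rows, exp(-g·||x-y||^2) = exp(-2g)·Σ_p (2g·x·y)^p / p!). This is a minimal NumPy illustration, not the authors' implementation; the function names, `gamma`, and the expansion `order` are illustrative assumptions.

```python
import math
import numpy as np

def taylor_gaussian_kernel(F, gamma=0.4, order=2):
    """Truncated Taylor expansion of a Gaussian kernel on L2-normalized rows.

    For unit-norm x, y: exp(-gamma*||x-y||^2) = exp(-2*gamma) * exp(2*gamma * x.y),
    which we approximate by summing the first `order`+1 Taylor terms.
    """
    G = F @ F.T  # pairwise dot products between embeddings
    K = np.zeros_like(G)
    base = math.exp(-2.0 * gamma)
    for p in range(order + 1):
        K += base * ((2.0 * gamma) ** p / math.factorial(p)) * (G ** p)
    return K

def cckd_correlation_loss(f_teacher, f_student, gamma=0.4, order=2):
    """Squared difference between teacher and student correlation matrices,
    averaged over the batch-size-squared entries (illustrative sketch)."""
    f_teacher = f_teacher / np.linalg.norm(f_teacher, axis=1, keepdims=True)
    f_student = f_student / np.linalg.norm(f_student, axis=1, keepdims=True)
    diff = (taylor_gaussian_kernel(f_teacher, gamma, order)
            - taylor_gaussian_kernel(f_student, gamma, order))
    b = f_teacher.shape[0]
    return float(np.sum(diff ** 2) / (b * b))
```

In a full training loop this term would be added, with a weighting factor, to the usual instance-level KD loss on softened logits; here only the correlation term is shown.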

Baoyun Peng, Xiao Jin, Jiaheng Liu, Shunfeng Zhou, Yichao Wu, Yu Liu, Dongsheng Li, Zhaoning Zhang • 2019

Related benchmarks

Task                 | Dataset               | Result                  | Rank
---------------------|-----------------------|-------------------------|-----
Image Classification | CIFAR-100 (test)      | Accuracy 73.56          | 3518
Image Classification | ImageNet-1k (val)     | --                      | 1453
Image Classification | ImageNet-1K           | Top-1 Acc 70.79         | 836
Image Classification | TinyImageNet (test)   | Accuracy 36.43          | 366
Image Classification | STL-10 (test)         | Accuracy 69.13          | 357
Image Classification | ImageNet (val)        | --                      | 300
Image Classification | ImageNet (val)        | --                      | 188
Image Classification | CIFAR100              | Average Accuracy 73.56  | 121
Image Classification | DomainNet             | Average Accuracy 33.86  | 58
Video Classification | Kinetics-400 v1 (val) | Top-1 Acc 68.52         | 35
Showing 10 of 26 rows
