
Decoupled Kullback-Leibler Divergence Loss

About

In this paper, we delve deeper into the Kullback-Leibler (KL) Divergence loss and mathematically prove that it is equivalent to the Decoupled Kullback-Leibler (DKL) Divergence loss, which consists of 1) a weighted Mean Square Error (wMSE) loss and 2) a Cross-Entropy loss incorporating soft labels. Thanks to the decomposed formulation of the DKL loss, we identify two areas for improvement. First, we address the limitation of KL/DKL in scenarios like knowledge distillation by breaking its asymmetric optimization property. This modification ensures that the wMSE component is always effective during training, providing extra constructive cues. Second, we introduce class-wise global information into KL/DKL to mitigate bias from individual samples. With these two enhancements, we derive the Improved Kullback-Leibler (IKL) Divergence loss and evaluate its effectiveness through experiments on the CIFAR-10/100 and ImageNet datasets, focusing on adversarial training and knowledge distillation tasks. The proposed approach achieves new state-of-the-art adversarial robustness on the public RobustBench leaderboard and competitive performance on knowledge distillation, demonstrating its substantial practical merits. Our code is available at https://github.com/jiequancui/DKL.
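The claimed equivalence between the KL loss and a wMSE term plus a soft-label Cross-Entropy term can be checked numerically at the level of gradients. The sketch below is an illustration under our own assumptions, not the authors' implementation: the weights `w[j][k] = p[j] * p[k]` (held constant, as if under a stop-gradient), the 1/4 factor, and all function names are hypothetical and come from working through the gradient of KL by hand, so the paper's exact scaling and notation may differ. It compares finite-difference gradients of KL with respect to each set of logits against those of the two decoupled terms.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_div(o_m, o_n):
    # KL(softmax(o_m) || softmax(o_n))
    p, q = softmax(o_m), softmax(o_n)
    return sum(pi * (math.log(pi) - math.log(qi)) for pi, qi in zip(p, q))

def wmse(o_m, o_n, w):
    # Weighted MSE over pairwise logit differences:
    # 1/4 * sum_jk w[j][k] * ((o_m[j] - o_m[k]) - (o_n[j] - o_n[k]))^2
    diff = [a - b for a, b in zip(o_m, o_n)]
    c = len(diff)
    return 0.25 * sum(w[j][k] * (diff[j] - diff[k]) ** 2
                      for j in range(c) for k in range(c))

def soft_ce(p_soft, o_n):
    # Cross-entropy of softmax(o_n) against fixed soft labels p_soft
    q = softmax(o_n)
    return -sum(pi * math.log(qi) for pi, qi in zip(p_soft, q))

def num_grad(f, x, eps=1e-6):
    # Central finite-difference gradient of a scalar function of a list
    grad = []
    for i in range(len(x)):
        hi = x[:i] + [x[i] + eps] + x[i + 1:]
        lo = x[:i] + [x[i] - eps] + x[i + 1:]
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad

# Arbitrary example logits (illustrative values only)
o_m = [1.2, -0.3, 0.5, 2.0]
o_n = [0.1, 0.4, -1.0, 1.5]

p = softmax(o_m)
w = [[pj * pk for pk in p] for pj in p]  # assumed weights, treated as constants

# Gradients with respect to o_m: KL versus the wMSE term
g_kl_m = num_grad(lambda x: kl_div(x, o_n), o_m)
g_wmse_m = num_grad(lambda x: wmse(x, o_n, w), o_m)

# Gradients with respect to o_n: KL versus the soft-label cross-entropy term
g_kl_n = num_grad(lambda x: kl_div(o_m, x), o_n)
g_ce_n = num_grad(lambda x: soft_ce(p, x), o_n)

max_gap_m = max(abs(a - b) for a, b in zip(g_kl_m, g_wmse_m))
max_gap_n = max(abs(a - b) for a, b in zip(g_kl_n, g_ce_n))
print("max gradient gap w.r.t. o_m:", max_gap_m)
print("max gradient gap w.r.t. o_n:", max_gap_n)
```

In this sketch the wMSE term carries all of the gradient with respect to `o_m`, and the Cross-Entropy term carries all of the gradient with respect to `o_n`. If `o_m` comes from a frozen teacher, as in knowledge distillation, the wMSE term therefore contributes nothing to training, which is consistent with the asymmetry the abstract describes.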

Jiequan Cui, Zhuotao Tian, Zhisheng Zhong, Xiaojuan Qi, Bei Yu, Hanwang Zhang • 2023

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | CIFAR-100 (val) | -- | 776 |
| Image Classification | ImageNet (val) | -- | 300 |
| Image Classification | StanfordCars | -- | 91 |
| Adversarial Robustness | CIFAR-10 (test) | -- | 76 |
| Image Classification | Caltech256 | Accuracy (Clean): 52 | 69 |
| Image Classification | FGVC Aircraft | -- | 41 |
| Image Classification | CIFAR10 | Clean Accuracy: 65.31 | 21 |
| Image Classification | Flowers102 | Accuracy (Clean): 25.94 | 20 |
| Image Classification | ImageNet (val) | Clean Accuracy: 67.15 | 18 |
| Image Classification | ImageNet (val) | Clean Accuracy: 67.15 | 18 |
Showing 10 of 31 rows
