Our new X account is live! Follow @wizwand_team for updates
WorkDL logo mark

Decoupled Kullback-Leibler Divergence Loss

About

In this paper, we delve deeper into the Kullback-Leibler (KL) Divergence loss and mathematically prove that it is equivalent to the Decoupled Kullback-Leibler (DKL) Divergence loss that consists of 1) a weighted Mean Square Error (wMSE) loss and 2) a Cross-Entropy loss incorporating soft labels. Thanks to the decomposed formulation of DKL loss, we have identified two areas for improvement. Firstly, we address the limitation of KL/DKL in scenarios like knowledge distillation by breaking its asymmetric optimization property. This modification ensures that the $\mathbf{w}$MSE component is always effective during training, providing extra constructive cues. Secondly, we introduce class-wise global information into KL/DKL to mitigate bias from individual samples. With these two enhancements, we derive the Improved Kullback-Leibler (IKL) Divergence loss and evaluate its effectiveness by conducting experiments on CIFAR-10/100 and ImageNet datasets, focusing on adversarial training, and knowledge distillation tasks. The proposed approach achieves new state-of-the-art adversarial robustness on the public leaderboard -- RobustBench and competitive performance on knowledge distillation, demonstrating the substantial practical merits. Our code is available at https://github.com/jiequancui/DKL.

Jiequan Cui, Zhuotao Tian, Zhisheng Zhong, Xiaojuan Qi, Bei Yu, Hanwang Zhang• 2023

Related benchmarks

TaskDatasetResultRank
Image ClassificationCIFAR-100 (val)--
661
Image ClassificationImageNet (val)--
300
Adversarial RobustnessCIFAR-10 (test)--
76
Image ClassificationCaltech256
Accuracy (Clean)52
51
Image ClassificationStanfordCars
Clean Accuracy14.2
40
Image ClassificationFGVC Aircraft
Clean Accuracy3.96
22
Image ClassificationCIFAR10
Clean Accuracy65.31
21
Image ClassificationTinyImageNet
Clean Accuracy70.84
17
Image ClassificationImageNet (val)
Top-1 Acc72.84
14
Image ClassificationFlowers102
Accuracy (Clean)25.94
11
Showing 10 of 24 rows

Other info

Code

Follow for update