
Aligning Logits Generatively for Principled Black-Box Knowledge Distillation

About

Black-Box Knowledge Distillation (B2KD) formalizes cloud-to-edge model compression when both the data and the model hosted on the server are invisible. B2KD faces challenges such as limited Internet exchange and the edge-cloud disparity of data distributions. In this paper, we formalize a two-step workflow consisting of deprivatization and distillation, and theoretically provide a new optimization direction, from logits to cell boundary, that differs from direct logits alignment. With its guidance, we propose a new method, Mapping-Emulation KD (MEKD), that distills a black-box cumbersome model into a lightweight one. Our method treats soft and hard responses identically and consists of: 1) deprivatization: emulating the inverse mapping of the teacher function with a generator, and 2) distillation: aligning the low-dimensional logits of the teacher and student models by reducing the distance between high-dimensional image points. For different teacher-student pairs, our method yields strong distillation performance on various benchmarks and outperforms previous state-of-the-art approaches.
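The two-step workflow can be sketched concretely. Below is a minimal PyTorch sketch of the deprivatization and distillation steps; it is an illustration under assumptions not stated on this page (an MNIST-like flattened input, small MLP generator and student networks, a local transfer batch `surrogate_x`, and a `query_teacher` placeholder for the black-box cloud API), not the authors' released implementation.

```python
# Minimal sketch of the two-step workflow described above (deprivatization,
# then distillation). All architectures, names, and the transfer set below
# are illustrative assumptions, not the authors' released code.
import torch
import torch.nn as nn
import torch.nn.functional as F

NUM_CLASSES, IMG_DIM = 10, 28 * 28  # assumed MNIST-like setting

class Generator(nn.Module):
    """Emulates the inverse mapping of the teacher: logits -> image points."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(NUM_CLASSES, 256), nn.ReLU(),
            nn.Linear(256, IMG_DIM), nn.Tanh())

    def forward(self, logits):
        return self.net(logits)

class Student(nn.Module):
    """Lightweight edge model producing low-dimensional logits."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(IMG_DIM, 128), nn.ReLU(),
            nn.Linear(128, NUM_CLASSES))

    def forward(self, x):
        return self.net(x)

def query_teacher(x):
    """Placeholder for the cloud API: only responses (soft or hard) are
    visible, never the teacher's weights, gradients, or training data."""
    return torch.randn(x.size(0), NUM_CLASSES)  # stand-in response

generator, student = Generator(), Student()
g_opt = torch.optim.Adam(generator.parameters(), lr=1e-4)
s_opt = torch.optim.Adam(student.parameters(), lr=1e-3)

surrogate_x = torch.rand(64, IMG_DIM)  # assumed local transfer inputs in [0, 1]

# Step 1 (deprivatization): fit the generator so that it maps the teacher's
# responses back to the inputs that produced them, i.e. approximates the
# inverse of the teacher function on the transfer set.
for _ in range(100):
    with torch.no_grad():
        t_logits = query_teacher(surrogate_x)
    recon = generator(t_logits)
    g_loss = F.mse_loss(recon, surrogate_x * 2 - 1)  # match Tanh range [-1, 1]
    g_opt.zero_grad()
    g_loss.backward()
    g_opt.step()

# Step 2 (distillation): freeze the generator and align teacher/student logits
# indirectly, by shrinking the distance between the high-dimensional image
# points they map to under the emulated inverse mapping.
for p in generator.parameters():
    p.requires_grad_(False)
for _ in range(100):
    with torch.no_grad():
        t_logits = query_teacher(surrogate_x)
        t_points = generator(t_logits)
    s_points = generator(student(surrogate_x))
    s_loss = F.mse_loss(s_points, t_points)
    s_opt.zero_grad()
    s_loss.backward()
    s_opt.step()
```

The design point this sketch illustrates is that the generator is frozen during distillation, so the only way for the student to shrink the image-point distance is to move its low-dimensional logits toward the teacher's.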

Jing Ma, Xiang Xiang, Ke Wang, Yuchuan Wu, Yongbin Li • 2022

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Image Classification | CIFAR-100 (test) | -- | 3518 |
| Image Classification | MNIST (test) | Accuracy: 99.45 | 882 |
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy: 61.21 | 840 |
| Image Classification | ImageNet-1K | Top-1 Accuracy: 61.21 | 836 |
| Image Classification | CIFAR-100 | Top-1 Accuracy: 67.36 | 622 |
| Image Classification | CIFAR-10 | -- | 507 |
| Image Classification | MNIST | -- | 395 |
| Image Classification | TinyImageNet (test) | -- | 366 |
| Image Classification | Tiny-ImageNet | Top-1 Accuracy: 54.93 | 143 |
| Image Classification | SVHN (test) | Top-1 Accuracy: 89.21 | 26 |

Other info

Code
