UNIC: Universal Classification Models via Multi-teacher Distillation

About

Pretrained models have become a commodity and offer strong results on a broad range of tasks. In this work, we focus on classification and seek to learn a unique encoder able to take from several complementary pretrained models. We aim at even stronger generalization across a variety of classification tasks. We propose to learn such an encoder via multi-teacher distillation. We first thoroughly analyse standard distillation when driven by multiple strong teachers with complementary strengths. Guided by this analysis, we gradually propose improvements to the basic distillation setup. Among those, we enrich the architecture of the encoder with a ladder of expendable projectors, which increases the impact of intermediate features during distillation, and we introduce teacher dropping, a regularization mechanism that better balances the teachers' influence. Our final distillation strategy leads to student models of the same capacity as any of the teachers, while retaining or improving upon the performance of the best teacher for each task. Project page and code: https://europe.naverlabs.com/unic

Mert Bulent Sariyildiz, Philippe Weinzaepfel, Thomas Lucas, Diane Larlus, Yannis Kalantidis • 2024
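
The abstract describes two key ingredients: per-teacher projectors on top of the student encoder, and teacher dropping as a regularizer. The sketch below is a minimal, illustrative PyTorch rendering of one multi-teacher distillation step under those ideas. Everything in it is an assumption for clarity: the names (TeacherProjector, distillation_step), the single-layer-tap MLP projector, the cosine feature loss, and the drop_prob value are not the authors' implementation, and UNIC's actual ladder of projectors taps intermediate encoder layers; the exact losses and schedules are in the code linked above.

import random

import torch
import torch.nn as nn
import torch.nn.functional as F


class TeacherProjector(nn.Module):
    # Hypothetical per-teacher projection head. In the paper, a "ladder" of
    # such projectors taps intermediate layers of the student encoder; all
    # projectors are expendable, i.e. discarded once distillation is done.
    def __init__(self, student_dim: int, teacher_dim: int, hidden_dim: int = 2048):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(student_dim, hidden_dim),
            nn.GELU(),
            nn.Linear(hidden_dim, teacher_dim),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.mlp(x)


def distillation_step(student: nn.Module,
                      projectors: nn.ModuleDict,
                      teachers: dict,
                      images: torch.Tensor,
                      drop_prob: float = 0.5) -> torch.Tensor:
    # One multi-teacher distillation step. The cosine feature-matching loss
    # here is an illustrative stand-in for the paper's distillation objective.
    feats = student(images)                      # (B, D_student)
    per_teacher_losses = []
    for name, teacher in teachers.items():
        with torch.no_grad():                    # teachers stay frozen
            target = teacher(images)             # (B, D_teacher)
        pred = projectors[name](feats)           # project into teacher space
        loss = 1.0 - F.cosine_similarity(pred, target, dim=-1).mean()
        per_teacher_losses.append(loss)

    # Teacher dropping: randomly ignore each teacher's loss this step so that
    # no single strong teacher dominates the gradients; keep at least one.
    kept = [l for l in per_teacher_losses if random.random() > drop_prob]
    if not kept:
        kept = [random.choice(per_teacher_losses)]
    return torch.stack(kept).mean()

At inference time only the student encoder is kept: the projectors exist solely to give each teacher its own target space during distillation, which is what makes them expendable afterwards.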

Related benchmarks

Task                       Dataset         Metric  Result  Rank
Semantic Segmentation      ADE20K          mIoU    48.3    936
Semantic Segmentation      Pascal Context  mIoU    81.82   111
Semantic Segmentation      NYUD v2         mIoU    58.56   96
Semantic Segmentation      Pascal Context  mIoU    81.82   43
Saliency Detection         Pascal Context  maxF    81.84   21
Surface Normal Estimation  Pascal Context  mErr    15.78   21
Surface Normal Estimation  NYUD            mErr    19.34   21
Semantic Segmentation      NYUD            mIoU    58.56   17
Depth Estimation           NYU V2          RMSE    0.4916  15
Human Parsing              Pascal Context  mIoU    72.24   11

Showing 10 of 14 rows.
