
DINO as a von Mises-Fisher mixture model

About

Self-distillation methods using Siamese networks are popular for self-supervised pre-training. DINO is one such method, based on a cross-entropy loss between $K$-dimensional probability vectors obtained by applying a softmax function to the dot product between representations and learned prototypes. Because the learned representations are $L^2$-normalized, we show that DINO and its derivatives, such as iBOT, can be interpreted as a mixture model of von Mises-Fisher components. Under this interpretation, DINO assumes equal precision for all components when the prototypes are also $L^2$-normalized. Using this insight, we propose DINO-vMF, which adds appropriate normalization constants when computing the cluster assignment probabilities. Unlike DINO, DINO-vMF is also stable for the larger ViT-Base model with unnormalized prototypes. We show that the added flexibility of the mixture model yields better image representations. The DINO-vMF pre-trained model consistently performs better than DINO on a range of downstream tasks. We obtain similar improvements for iBOT-vMF vs iBOT, showing that our proposed modification is also relevant for other methods derived from DINO.

Hariprasath Govindarajan, Per Sidén, Jacob Roll, Fredrik Lindsten • 2024
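The cluster-assignment step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation. It assumes uniform mixture weights, treats each prototype's norm divided by the temperature as the component precision $\kappa_k$, and adds the standard von Mises-Fisher log-normalizer $\log C_p(\kappa_k)$ to each logit before the softmax. The function names, the temperature value and the series-based Bessel evaluation are all illustrative choices, and the series is only suitable for modest $\kappa$ and low dimension.

```python
import math


def log_bessel_i(v, x, terms=200):
    """log I_v(x) for x > 0, from the power series summed in log space.

    Illustrative only: fine for modest x, not a production routine.
    """
    logs = [
        (2 * m + v) * math.log(x / 2) - math.lgamma(m + 1) - math.lgamma(m + v + 1)
        for m in range(terms)
    ]
    top = max(logs)
    return top + math.log(sum(math.exp(t - top) for t in logs))


def log_vmf_normalizer(kappa, p):
    """log C_p(kappa) of a vMF density on the unit sphere in R^p (kappa > 0)."""
    v = p / 2 - 1
    return v * math.log(kappa) - (p / 2) * math.log(2 * math.pi) - log_bessel_i(v, kappa)


def softmax(logits):
    top = max(logits)
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def assignment_probs(x, prototypes, temperature=0.1, use_vmf=False):
    """Cluster assignment probabilities for one L2-normalized representation x.

    use_vmf=False: plain softmax over scaled dot products (DINO-style).
    use_vmf=True:  additionally adds log C_p(||w_k|| / temperature) per
                   prototype (assumed reading of the vMF correction).
    """
    p = len(x)
    logits = []
    for w in prototypes:
        logit = sum(wi * xi for wi, xi in zip(w, x)) / temperature
        if use_vmf:
            kappa = math.sqrt(sum(wi * wi for wi in w)) / temperature
            logit += log_vmf_normalizer(kappa, p)
        logits.append(logit)
    return softmax(logits)


if __name__ == "__main__":
    s = 1 / math.sqrt(2)
    x = [s, s, 0.0, 0.0]                      # unit-norm representation
    prototypes = [[2.0, 0.0, 0.0, 0.0],       # larger norm, higher precision
                  [0.0, 1.0, 0.0, 0.0]]
    print(assignment_probs(x, prototypes, use_vmf=False))
    print(assignment_probs(x, prototypes, use_vmf=True))
```

When all prototypes have the same norm, the added term is identical for every component and cancels in the softmax, so both variants return the same probabilities. This matches the equal-precision case mentioned in the abstract. With unequal norms the normalizer changes the assignment, which is where the two variants differ.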

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Image Classification | ImageNet-1k (val) | -- | 1453 |
| Video Object Segmentation | DAVIS 2017 (val) | J mean 61.9 | 1130 |
| Image Classification | ImageNet-1k (val) | Top-1 Accuracy 59.1 | 840 |
| Image Classification | CIFAR-100 | -- | 622 |
| Image Classification | ImageNet-1k (val) | -- | 512 |
| Image Classification | ImageNet (val) | Accuracy 74.14 | 300 |
| Image Classification | ImageNet-1K | Accuracy 84.1 | 190 |
| Image Retrieval | Revisited Oxford (ROxf) (Medium) | mAP 38.1 | 124 |
| Image Retrieval | Revisited Paris (RPar) (Hard) | mAP 39.5 | 115 |
| Image Classification | ImageNet 1K (train val) | Top-1 Accuracy 51.6 | 107 |
Showing 10 of 24 rows
