Self-Supervised Learning as Discrete Communication

About

Most self-supervised learning (SSL) methods learn continuous visual representations by aligning different views of the same input, offering limited control over how information is structured across representation dimensions. In this work, we frame visual self-supervised learning as a discrete communication process between a teacher and a student network, where semantic information is transmitted through a fixed-capacity binary channel. Rather than aligning continuous features, the student predicts multi-label binary messages produced by the teacher. Discrete agreement is enforced through an element-wise binary cross-entropy objective, while a coding-rate regularization term encourages effective utilization of the constrained channel, promoting structured representations. We further show that periodically reinitializing the projection head strengthens this effect by encouraging embeddings that remain predictive across multiple discrete encodings. Extensive experiments demonstrate consistent improvements over continuous agreement baselines on image classification, retrieval, and dense visual prediction tasks, as well as under domain shift through self-supervised adaptation. Beyond backbone representations, we analyze the learned binary codes and show that they form a compact and informative discrete language, capturing semantic factors reusable across classes.
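As a rough illustration of the discrete agreement objective described above, the following minimal sketch computes an element-wise binary cross-entropy between student logits and a binary teacher message. The abstract does not say how the teacher produces its message, so the hard threshold at zero, the function names `binarize` and `bce_with_logits`, and the toy logits are all assumptions made for illustration. The coding-rate regularizer and the periodic reinitialization of the projection head are not shown.

```python
import math

def binarize(teacher_logits, threshold=0.0):
    """Turn teacher logits into a multi-label binary message.

    Assumption: a hard threshold; the paper may binarize differently.
    """
    return [1.0 if z > threshold else 0.0 for z in teacher_logits]

def bce_with_logits(student_logits, message):
    """Element-wise binary cross-entropy, averaged over the bits of the message."""
    total = 0.0
    for s, b in zip(student_logits, message):
        # Numerically stable form: max(s, 0) - s*b + log(1 + exp(-|s|))
        total += max(s, 0.0) - s * b + math.log1p(math.exp(-abs(s)))
    return total / len(message)

# Hypothetical 4-bit channel with toy logits.
teacher_logits = [2.3, -1.1, 0.4, -3.0]
message = binarize(teacher_logits)

# A student that matches every bit versus one that flips every bit.
agree = bce_with_logits([4.0, -4.0, 4.0, -4.0], message)
disagree = bce_with_logits([-4.0, 4.0, -4.0, 4.0], message)
print(message, agree, disagree)
```

In this toy setting the loss is small when the student's logits carry the same sign pattern as the teacher's message and large when every bit is flipped, which is the behavior the binary channel is meant to reward.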

Kawtar Zaher, Ilyass Moummad, Olivier Buisson, Alexis Joly • 2026

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Object Detection | COCO 2017 (val) | AP | 2.8 | 2454 |
| Instance Segmentation | COCO 2017 (val) | -- | -- | 1144 |
| Video Object Segmentation | DAVIS 2017 (val) | J mean | 60.99 | 1130 |
| Image Classification | ImageNet-1K | Top-1 Acc | 77.8 | 836 |
| Fine grained classification | Food101 | Accuracy | 82.88 | 30 |
| Fine grained classification | iNaturalist-19 | -- | -- | 24 |
| Fine-grained Image Classification | Birds-525 (B) | Accuracy | 96.72 | 14 |
| Fine-grained Recognition | PlantNet300k | Accuracy | 80.04 | 8 |
| Image Retrieval | ImageNet V2 (val) | mAP | 52.9 | 4 |
| Image Retrieval | ImageNet100 (val) | mAP | 82.29 | 4 |
Showing 10 of 13 rows
