
Coreset Sampling from Open-Set for Fine-Grained Self-Supervised Learning

About

Deep learning in general domains has constantly been extended to domain-specific tasks requiring the recognition of fine-grained characteristics. However, real-world applications for fine-grained tasks suffer from two challenges: a high reliance on expert knowledge for annotation and the necessity of a versatile model for various downstream tasks in a specific domain (e.g., prediction of categories, bounding boxes, or pixel-wise annotations). Fortunately, recent self-supervised learning (SSL) is a promising approach to pretrain a model without annotations, serving as an effective initialization for any downstream task. Since SSL does not rely on the presence of annotations, it generally utilizes a large-scale unlabeled dataset, referred to as an open-set. In this sense, we introduce a novel Open-Set Self-Supervised Learning problem under the assumption that a large-scale unlabeled open-set is available, as well as the fine-grained target dataset, during the pretraining phase. In our problem setup, it is crucial to consider the distribution mismatch between the open-set and the target dataset. Hence, we propose the SimCore algorithm to sample a coreset, the subset of an open-set that has a minimum distance to the target dataset in the latent space. We demonstrate that SimCore significantly improves representation learning performance through extensive experimental settings, including eleven fine-grained datasets and seven open-sets in various downstream tasks.
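The coreset idea in the abstract can be illustrated with a minimal sketch. This is not the authors' SimCore implementation: it assumes feature vectors have already been extracted by some encoder, uses cosine similarity as the latent-space closeness measure, and takes a fixed `budget` instead of any stopping rule the paper may define. The function name `select_coreset` and all parameters are hypothetical.

```python
import math


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v] if norm > 0 else list(v)


def dot(u, v):
    """Dot product; equals cosine similarity for unit-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def select_coreset(open_feats, target_feats, budget):
    """Return indices of the `budget` open-set samples closest to the target set.

    Assumption (not from the paper): an open-set sample is scored by its
    highest cosine similarity to any target sample, and the top scorers are kept.
    """
    open_n = [normalize(v) for v in open_feats]
    target_n = [normalize(v) for v in target_feats]
    scores = [max(dot(o, t) for t in target_n) for o in open_n]
    ranked = sorted(range(len(open_n)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:budget])


# Toy 2-D "latent" features: open-set samples 0 and 2 point the same way as the targets.
open_feats = [[1.0, 0.1], [0.0, 1.0], [0.9, 0.2], [-1.0, 0.0]]
target_feats = [[1.0, 0.0], [0.8, 0.1]]
coreset = select_coreset(open_feats, target_feats, budget=2)
print(coreset)  # [0, 2]
```

In this toy setup the selected indices would then be merged with the target dataset for SSL pretraining; how the budget is chosen is left open here.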

Sungnyun Kim, Sangmin Bae, Se-Young Yun • 2023

Related benchmarks

Task                  Dataset   Result          Rank
Classification        Cars      Accuracy 60.29  314
Image Classification  Aircraft  Accuracy 48.45  302
Image Classification  Pets      Accuracy 81.75  204
Image Classification  Flowers   Accuracy 87.04  127
Image Classification  Food      Accuracy 91.31  92
Image Classification  Bird      Accuracy 39.21  29
Classification        Texture   Accuracy 67.66  17
Image Classification  Dogs      Accuracy 66.82  16
Image Classification  Action    Accuracy 67.46  2
Image Classification  Indoor    Accuracy 71.95  2
Showing 10 of 11 rows
