DABS: A Domain-Agnostic Benchmark for Self-Supervised Learning

About

Self-supervised learning algorithms, including BERT and SimCLR, have enabled significant strides in fields like natural language processing, computer vision, and speech processing. However, these algorithms are domain-specific, meaning that new self-supervised learning algorithms must be developed for each new setting, including myriad healthcare, scientific, and multimodal domains. To catalyze progress toward domain-agnostic methods, we introduce DABS: a Domain-Agnostic Benchmark for Self-supervised learning. To perform well on DABS, an algorithm is evaluated on seven diverse domains: natural images, multichannel sensor data, English text, speech recordings, multilingual text, chest x-rays, and images with text descriptions. Each domain contains an unlabeled dataset for pretraining; the model is then scored based on its downstream performance on a set of labeled tasks in the domain. We also present e-Mix and ShED: two baseline domain-agnostic algorithms; their relatively modest performance demonstrates that significant progress is needed before self-supervised learning is an out-of-the-box solution for arbitrary domains. Code for benchmark datasets and baseline algorithms is available at https://github.com/alextamkin/dabs.

Alex Tamkin, Vincent Liu, Rongfei Lu, Daniel Fein, Colin Schultz, Noah Goodman • 2021

Related benchmarks

Task                        Dataset               Result           Rank
Image Classification        Aircraft              Accuracy 2.7     302
Classification              CUB                   Accuracy 1.6     85
Classification              DTD                   Accuracy 7.4     22
Classification              Google commands       Accuracy 4.9     13
Visual Question Answering   VQA                   Accuracy 53.4    12
Classification              Fluent Loc            Accuracy 62.1    6
Classification              SCOP                  Accuracy 8       6
Classification              Genomics (Genom)      Accuracy 37.2    6
Classification              Mismatched-caption    Accuracy 49.8    6
Classification              Genomics OOD          Accuracy 8.6     6
Showing 10 of 32 rows
