SCARF: Self-Supervised Contrastive Learning using Random Feature Corruption
About
Self-supervised contrastive representation learning has proved incredibly successful in the vision and natural language domains, enabling state-of-the-art performance with orders of magnitude less labeled data. However, such methods are domain-specific and little has been done to leverage this technique on real-world tabular datasets. We propose SCARF, a simple, widely-applicable technique for contrastive learning, where views are formed by corrupting a random subset of features. When applied to pre-train deep neural networks on the 69 real-world, tabular classification datasets from the OpenML-CC18 benchmark, SCARF not only improves classification accuracy in the fully-supervised setting but does so also in the presence of label noise and in the semi-supervised setting where only a fraction of the available training data is labeled. We show that SCARF complements existing strategies and outperforms alternatives like autoencoders. We conduct comprehensive ablations, detailing the importance of a range of factors.
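As a rough illustration of how a view might be formed by corrupting a random subset of features, the NumPy sketch below replaces a random fraction of each row's entries. The abstract does not say what the corrupted values are, so two choices here are assumptions made only for illustration: replacements are drawn from the same column's values elsewhere in the batch, and the default `corruption_rate` of 0.6 is a placeholder. The function name `corrupt_features` is hypothetical and is not taken from any released implementation.

```python
import numpy as np


def corrupt_features(x, corruption_rate=0.6, rng=None):
    """Return a corrupted view of a 2-D batch `x` plus the corruption mask.

    Assumption (not stated in the abstract): each corrupted entry is replaced
    by the value of the same feature taken from a randomly chosen row of the
    batch, so replacements follow that column's empirical distribution.
    """
    rng = np.random.default_rng() if rng is None else rng
    n_rows, n_features = x.shape

    # True where an entry is selected for corruption.
    mask = rng.random((n_rows, n_features)) < corruption_rate

    # For every (row, feature) position, pick a random donor row.
    donor_rows = rng.integers(0, n_rows, size=(n_rows, n_features))

    # replacements[i, j] == x[donor_rows[i, j], j]
    replacements = x[donor_rows, np.arange(n_features)]

    # Keep the original entry where mask is False, the replacement otherwise.
    corrupted = np.where(mask, replacements, x)
    return corrupted, mask


# Small demonstration batch: 5 rows, 4 features.
rng = np.random.default_rng(0)
x = np.arange(20, dtype=float).reshape(5, 4)
x_corrupted, mask = corrupt_features(x, corruption_rate=0.6, rng=rng)
```

In a contrastive setup, `x` and `x_corrupted` would be passed through the same encoder and treated as a positive pair, with the other rows in the batch serving as negatives. The choice of loss and encoder is outside what this sketch covers.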
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Image Classification | FashionMNIST (test) | Accuracy | 48.34 | 363 |
| Classification | HI | Accuracy | 0.56 | 59 |
| Multiclass Classification | CMC | Accuracy | 37.75 | 41 |
| Binary Classification | dresses-sales (DS) (test) | AUROC | 66.3 | 40 |
| Binary Classification | cylinder-bands (CB) (test) | AUROC | 0.719 | 40 |
| Binary Classification | income IC 1995 (test) | AUROC | 0.905 | 39 |
| Classification | CNAE high-dimensional and sparse (test) | Accuracy | 59.59 | 39 |
| Credit approval prediction | Credit Approval dataset (test) | AUROC | 0.861 | 37 |
| Classification | Devnagari dev (test) | Accuracy | 11.82 | 36 |
| Aggregate Tabular Benchmarking | Aggregate | Avg Rank | 8.56 | 33 |