Classification Datasets

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Concept Extraction Evaluation | 4 classification datasets average | RAcc: 99.8 | 35 |
| Tabular Classification | 53 classification datasets (unseen) | Mean Accuracy: 75.42 | 18 |
| Zero-shot Classification | Classification Datasets (MMLU, OBQA, ARC-e, WinoGrande, ARC-c, PIQA, HellaSwag) | MMLU (5-shot): 37.1 | 18 |
| Classification | 80 classification datasets | Median Effect Size (F1 pts): 0.11 | 17 |
| Open-Vocabulary Classification | 11 classification datasets (test) | ImageNet Accuracy: 76.77 | 16 |
| Classification | medium-sized classification datasets | Accuracy: 78.58 | 14 |
| Classification | 7 classification datasets (Iris, Wine, Breast Cancer, Digits, etc.) (cross-validation) | Accuracy: 91.17 | 10 |
| Tabular Classification | 50 classification datasets | Mean Accuracy: 84.36 | 10 |
| Classification | 6 out-of-domain classification datasets (test) | Accuracy: 65.2 | 9 |
| Tabular Data Generation | Classification Datasets | Avg. JSD: 0.05 | 2 |
| Classification | 15 Classification Datasets | TabMixNN Wins: 1,067 | 1 |