
Simple data balancing achieves competitive worst-group-accuracy

About

We study the problem of learning classifiers that perform well across (known or unknown) groups of data. After observing that common worst-group-accuracy datasets suffer from substantial imbalances, we set out to compare state-of-the-art methods to simple balancing of classes and groups by either subsampling or reweighting data. Our results show that these data balancing baselines achieve state-of-the-art accuracy, while being faster to train and requiring no additional hyper-parameters. In addition, we highlight that access to group information is most critical for model selection purposes, and not so much during training. All in all, our findings beg closer examination of benchmarks and methods for research in worst-group-accuracy optimization.
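The two balancing baselines from the abstract can be sketched in a few lines. This is an illustrative sketch, not the authors' code: the function names `subsample_groups` and `reweight_groups` and the exact normalization are assumptions, but they capture the two ideas, i.e. subsampling every group down to the smallest group's size, or weighting each example inversely to its group's frequency.

```python
import numpy as np

def subsample_groups(groups, seed=0):
    """Indices that balance the data by subsampling each group
    down to the size of the smallest group (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    groups = np.asarray(groups)
    uniq, counts = np.unique(groups, return_counts=True)
    n_min = counts.min()
    chosen = [rng.choice(np.flatnonzero(groups == g), size=n_min, replace=False)
              for g in uniq]
    return np.concatenate(chosen)

def reweight_groups(groups):
    """Per-example loss weights inversely proportional to group
    frequency, so every group contributes equally in expectation."""
    groups = np.asarray(groups)
    uniq, counts = np.unique(groups, return_counts=True)
    freq = dict(zip(uniq.tolist(), (counts / len(groups)).tolist()))
    return np.array([1.0 / (len(uniq) * freq[g]) for g in groups])
```

With the normalization above, the weights sum to the dataset size and each group receives an equal share of the total weight; the subsampled index set can be fed to any standard training loop with no extra hyper-parameters.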

Badr Youbi Idrissi, Martin Arjovsky, Mohammad Pezeshki, David Lopez-Paz · 2021

Related benchmarks

Task | Dataset | Metric | Result | Rank
Image Classification | Waterbirds (test) | Worst-Group Accuracy | 88.87 | 112
Classification | CelebA (test) | -- | -- | 92
Natural Language Inference | MultiNLI (test) | -- | -- | 81
Attribute Classification | CelebA (test) | -- | -- | 60
Classification | CivilComments (test) | Worst-case Accuracy | 78.9 | 47
Regression | Dissecting Health Bias | MSE | 0.165 | 24
Regression | Swiss Asylum Seekers | MSE | 0.097 | 24
Regression | UK Asylum Decisions | MSE | 0.186 | 24
Predictive Modeling across Groups | Swiss Asylum Seekers | RWA | 38.4 | 24
Predictive Modeling across Groups | Education | RWA | 0.941 | 24

(Showing 10 of 21 rows.)
