FACTS: First Amplify Correlations and Then Slice to Discover Bias

About

Computer vision datasets frequently contain spurious correlations between task-relevant labels and (easy to learn) latent task-irrelevant attributes (e.g. context). Models trained on such datasets learn "shortcuts" and underperform on bias-conflicting slices of data where the correlation does not hold. In this work, we study the problem of identifying such slices to inform downstream bias mitigation strategies. We propose First Amplify Correlations and Then Slice to Discover Bias (FACTS), wherein we first amplify correlations to fit a simple bias-aligned hypothesis via strongly regularized empirical risk minimization. Next, we perform correlation-aware slicing via mixture modeling in bias-aligned feature space to discover underperforming data slices that capture distinct correlations. Despite its simplicity, our method considerably improves over prior work (by as much as 35% precision@10) in correlation bias identification across a range of diverse evaluation settings. Our code is available at: https://github.com/yvsriram/FACTS.
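
The two stages described above can be illustrated with a toy sketch. This is not the authors' implementation (see the linked repository for that). It assumes a two-feature synthetic dataset, a logistic-regression model trained with a large L2 penalty as a stand-in for "strongly regularized empirical risk minimization", and a 1-D Gaussian mixture fitted per class on the model's logit as a stand-in for the "bias-aligned feature space". All function names, the data generator, and the hyperparameters are hypothetical.

```python
import math
import random

def train_regularized_logreg(X, y, weight_decay=1.0, lr=0.1, epochs=200):
    """Stage 1 (toy): logistic-regression ERM with a strong L2 penalty."""
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for xi, yi in zip(X, y):
            p = 1.0 / (1.0 + math.exp(-logit(w, b, xi)))
            for j in range(d):
                grad_w[j] += (p - yi) * xi[j] / n
            grad_b += (p - yi) / n
        # Weight decay pulls the weights toward zero at every step.
        w = [wj - lr * (gj + weight_decay * wj) for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def logit(w, b, x):
    return sum(wj * xj for wj, xj in zip(w, x)) + b

def fit_gmm_1d(values, k=2, iters=50):
    """Stage 2 (toy): EM for a 1-D Gaussian mixture; returns a component id per value."""
    lo, hi = min(values), max(values)
    means = [lo + (hi - lo) * (c + 0.5) / k for c in range(k)]
    variances = [max(((hi - lo) / (2 * k)) ** 2, 1e-6)] * k
    weights = [1.0 / k] * k
    resp = []
    for _ in range(iters):
        resp = []
        for v in values:  # E-step: responsibility of each component for v
            dens = [weights[c] * math.exp(-(v - means[c]) ** 2 / (2 * variances[c]))
                    / math.sqrt(2 * math.pi * variances[c]) for c in range(k)]
            total = sum(dens) or 1e-300
            resp.append([dn / total for dn in dens])
        for c in range(k):  # M-step: re-estimate weight, mean, and variance
            nc = sum(r[c] for r in resp)
            if nc < 1e-9:
                continue
            means[c] = sum(r[c] * v for r, v in zip(resp, values)) / nc
            variances[c] = max(
                sum(r[c] * (v - means[c]) ** 2 for r, v in zip(resp, values)) / nc, 1e-6)
            weights[c] = nc / len(values)
    return [max(range(k), key=lambda c: r[c]) for r in resp]

def make_toy_data(n=400, corr=0.9, seed=0):
    """Hypothetical data: the spurious attribute matches the label with probability corr."""
    rng = random.Random(seed)
    X, y, attr = [], [], []
    for _ in range(n):
        yi = rng.randint(0, 1)
        ai = yi if rng.random() < corr else 1 - yi
        core = (2 * yi - 1) * 0.5 + rng.gauss(0, 1.0)      # weak, noisy task signal
        spurious = (2 * ai - 1) * 2.0 + rng.gauss(0, 0.3)  # strong, clean shortcut
        X.append([core, spurious])
        y.append(yi)
        attr.append(ai)
    return X, y, attr

def discover_worst_slice(X, y, target_class=1):
    """Slice one class by mixture component and return the least accurate slice."""
    w, b = train_regularized_logreg(X, y)
    idx = [i for i, yi in enumerate(y) if yi == target_class]
    logits = [logit(w, b, X[i]) for i in idx]
    comps = fit_gmm_1d(logits)
    slices = {}
    for i, z, c in zip(idx, logits, comps):
        slices.setdefault(c, []).append((i, int(z > 0) == target_class))
    worst = min(slices.values(), key=lambda s: sum(ok for _, ok in s) / len(s))
    return [i for i, _ in worst]

X, y, attr = make_toy_data()
worst = discover_worst_slice(X, y)
conflict_rate = sum(attr[i] != y[i] for i in worst) / len(worst)
print(f"worst slice size={len(worst)}, bias-conflicting fraction={conflict_rate:.2f}")
```

In this toy setting the heavily regularized model leans on the shortcut feature, so the logits within a class split into a bias-aligned mode and a bias-conflicting mode, and the mixture component with the lowest accuracy should consist mostly of bias-conflicting samples. Real image data would need a deep feature extractor and a multivariate mixture model, which this sketch does not attempt.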

Sriram Yenamandra, Pratik Ramesh, Viraj Prabhu, Judy Hoffman • 2023

Related benchmarks

Task	Dataset	Metric	Value
Image Classification	Waterbirds	WG Accuracy	88.9	283
Image Classification	CelebA	WG Score	60	62
Bias discovery	CelebA standard (test)	Precision@10	1	8
Bias discovery	Waterbirds standard (test)	Precision@10	100	8
Bias discovery	NICO++ 75/90/95 (test)	Precision@10 (75% Strength)	0.6	6
Error slice discovery	FeSD	Precision@10	0.28	3
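
For readers unfamiliar with the Precision@10 metric in the table, here is a minimal sketch of how a precision@k score for slice discovery can be computed. It assumes the common definition from the slice-discovery literature: for one ground-truth slice, take the k samples that a predicted slice scores highest, measure the fraction that belong to the ground-truth slice, and report the best value over all predicted slices. The function names and example values are hypothetical, and the exact protocol behind each benchmark row may differ.

```python
def precision_at_k(scores, in_true_slice, k=10):
    """Fraction of the k highest-scoring samples that lie in the ground-truth slice."""
    top = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    return sum(in_true_slice[i] for i in top) / k

def best_precision_at_k(slice_scores, in_true_slice, k=10):
    """Best precision@k over all predicted slices for one ground-truth slice."""
    return max(precision_at_k(s, in_true_slice, k) for s in slice_scores)

# Two predicted slices scored over five samples; 1 marks ground-truth slice membership.
slice_scores = [
    [0.9, 0.8, 0.7, 0.2, 0.1],
    [0.9, 0.8, 0.1, 0.7, 0.2],
]
in_true_slice = [1, 1, 0, 1, 0]
print(best_precision_at_k(slice_scores, in_true_slice, k=3))  # 1.0
```

The first predicted slice ranks one non-member in its top 3 (precision 2/3), while the second ranks three members (precision 1.0), so the reported value is 1.0.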

