Debiasing Vision-Language Models via Biased Prompts

About

Machine learning models have been shown to inherit biases from their training datasets. This can be particularly problematic for vision-language foundation models trained on uncurated datasets scraped from the internet. The biases can be amplified and propagated to downstream applications like zero-shot classifiers and text-to-image generative models. In this study, we propose a general approach for debiasing vision-language foundation models by projecting out biased directions in the text embedding. In particular, we show that debiasing only the text embedding with a calibrated projection matrix suffices to yield robust classifiers and fair generative models. The proposed closed-form solution enables easy integration into large-scale pipelines, and empirical results demonstrate that our approach effectively reduces social bias and spurious correlation in both discriminative and generative vision-language models without the need for additional data or training.

Ching-Yao Chuang, Varun Jampani, Yuanzhen Li, Antonio Torralba, Stefanie Jegelka • 2023
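The projection idea in the abstract can be illustrated with a small NumPy sketch. It is not the authors' released code. The function names, the `lam` parameter, and the toy data are hypothetical. The calibration objective is also an assumption: keep the projection close to the plain orthogonal projection P0 while pulling together the projected embeddings of prompt pairs that should be treated alike. That assumed objective, ||P - P0||_F^2 + (lam / n) * sum_k ||P d_k||^2 with d_k the difference between the two embeddings of pair k, has the closed-form minimizer P0 (I + (lam / n) * sum_k d_k d_k^T)^{-1}. The paper's exact formulation may differ.

```python
import numpy as np


def orthogonal_projection(bias_dirs):
    """Return P0 = I - A A^+, which removes the span of the bias directions.

    bias_dirs: array of shape (k, d), one biased direction per row
    (for example, text embeddings of biased prompts).
    """
    A = np.asarray(bias_dirs, dtype=float).T      # shape (d, k)
    d = A.shape[0]
    # A @ pinv(A) is the orthogonal projector onto the column space of A.
    return np.eye(d) - A @ np.linalg.pinv(A)


def calibrated_projection(P0, pair_diffs, lam):
    """Closed-form minimizer of the ASSUMED objective
        ||P - P0||_F^2 + (lam / n) * sum_k ||P d_k||^2,
    namely P = P0 (I + (lam / n) * D^T D)^{-1}.

    pair_diffs: array of shape (n, d); row k is z_i - z_j for a pair of
    prompt embeddings that should coincide after debiasing.
    """
    D = np.asarray(pair_diffs, dtype=float)
    d = P0.shape[0]
    G = np.eye(d) + (lam / len(D)) * (D.T @ D)
    return P0 @ np.linalg.inv(G)


def debias(text_embeddings, P):
    """Apply P to each row embedding and re-normalize to unit length."""
    Z = np.asarray(text_embeddings, dtype=float) @ P.T
    return Z / np.linalg.norm(Z, axis=1, keepdims=True)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d = 8
    bias_dirs = rng.normal(size=(2, d))           # toy biased directions
    pair_diffs = rng.normal(size=(5, d))          # toy pair differences
    class_prompts = rng.normal(size=(3, d))       # toy class-prompt embeddings

    P0 = orthogonal_projection(bias_dirs)
    P = calibrated_projection(P0, pair_diffs, lam=100.0)
    print(debias(class_prompts, P).shape)
```

Only the text embeddings are transformed here, matching the abstract's claim that debiasing the text side alone suffices. Setting `lam=0.0` recovers the uncalibrated projection P0.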

Related benchmarks

Task	Dataset	Result
Text-to-Image Retrieval	Flickr30K	R@1 79.02	559
Image Classification	Waterbirds	Average Accuracy 86.31	209
Classification	CelebA	--	197
Zero-shot Image Classification	ImageNet-1K	Top-1 Accuracy 0.7753	115
Social Bias Evaluation	FairFace	MS 0.288	54
Text-to-Image Retrieval	COCO 2017	Recall@5 90.8	43
Bias Mitigation for Stereotype Queries	UTKFACE Gender	KL Divergence 0.03	33
Bias Mitigation for Stereotype Queries	UTKFACE Race	KL Divergence 0.182	33
Social Debiasing	FairFace Out-of-Domain	MaxSkew (MS) 0.094	32
Social Debiasing	FACET Out-of-Domain	MS 0.417	32

Showing 10 of 36 rows