
PRISM: Reducing Spurious Implicit Biases in Vision-Language Models with LLM-Guided Embedding Projection

About

We introduce Projection-based Reduction of Implicit Spurious bias in vision-language Models (PRISM), a new data-free and task-agnostic solution for bias mitigation in VLMs such as CLIP. VLMs often inherit and amplify biases present in their training data, leading to skewed predictions. PRISM is designed to debias VLMs without relying on predefined bias categories or additional external data. It operates in two stages: first, an LLM is prompted with simple class prompts to generate scene descriptions that contain spurious correlations. Next, PRISM uses our novel contrastive-style debiasing loss to learn a projection that maps the embeddings onto a latent space that minimizes spurious correlations while preserving the alignment between image and text embeddings. Extensive experiments demonstrate that PRISM outperforms current debiasing methods on the commonly used Waterbirds and CelebA datasets. We make our code public at: https://github.com/MahdiyarMM/PRISM.
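As a rough illustration of the second stage, the sketch below builds a fixed orthogonal projection that removes a spurious-attribute subspace from embeddings. This is a simpler, hypothetical stand-in for PRISM's learned contrastive projection: the function name, the toy data, and the use of LLM-generated description differences as spurious directions are assumptions, not the paper's actual implementation.

```python
import numpy as np

def debias_projection(spurious_dirs: np.ndarray) -> np.ndarray:
    """Projection matrix onto the orthogonal complement of spurious directions.

    spurious_dirs: (k, d) array of spurious-attribute directions, e.g.
    embedding differences between LLM-generated biased scene descriptions
    ("a landbird over water" vs. "a landbird over land"). Hypothetical
    stand-in for PRISM's learned projection.
    """
    # Orthonormal basis of the spurious subspace via reduced QR decomposition.
    q, _ = np.linalg.qr(spurious_dirs.T)          # q has shape (d, k)
    d = spurious_dirs.shape[1]
    # I - QQ^T zeroes out any component lying in the spurious subspace.
    return np.eye(d) - q @ q.T

# Toy 3-D example: the spurious attribute lies along axis 0.
spurious = np.array([[1.0, 0.0, 0.0]])
P = debias_projection(spurious)
emb = np.array([0.9, 0.3, 0.1])
debiased = P @ emb  # component along the spurious axis is removed
```

Unlike this fixed linear projection, PRISM learns its projection with a contrastive-style loss, which lets it trade off spurious-correlation removal against preserving image-text alignment.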

Mahdiyar Molahasani, Azadeh Motamedi, Michael Greenspan, Il-Min Kim, Ali Etemad• 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Classification | CelebA | – | – | 185 |
| Social Bias Evaluation | FairFace | MS | 0.245 | 54 |
| Bias Mitigation for Stereotype Queries | UTKFACE Race | KL Divergence | 0.133 | 33 |
| Bias Mitigation for Stereotype Queries | UTKFACE Gender | KL Divergence | 0.088 | 33 |
| Image Classification | FACET | Macro F1 | 69.2 | 27 |
| Classification | Waterbirds Background (test) | Accuracy | 91.8 | 24 |
| Zero-shot classification fairness | Waterbirds Background | Accuracy (Zero-shot) | 88.6 | 24 |
| Classification | CelebA Gender (test) | Accuracy | 86.3 | 24 |
| Image Retrieval | CelebA Hair Color queries | KL Divergence | 0.06 | 24 |
| Image Retrieval | CelebA Stereotype queries | KL Divergence | 0.061 | 24 |

Showing 10 of 11 rows.
