
Exposing Hidden Biases in Text-to-Image Models via Automated Prompt Search

About

Text-to-image (TTI) diffusion models have achieved remarkable visual quality, yet they have been repeatedly shown to exhibit social biases across sensitive attributes such as gender, race, and age. To mitigate these biases, existing approaches frequently depend on curated prompt datasets, either manually constructed or generated with large language models (LLMs), as part of their training and/or evaluation procedures. Besides the curation cost, this reliance also risks overlooking unanticipated, less obvious prompts that trigger biased generation, even in models that have undergone debiasing. In this work, we introduce Bias-Guided Prompt Search (BGPS), a framework that automatically generates prompts that aim to maximize the presence of biases in the resulting images. BGPS comprises two components: (1) an LLM instructed to produce attribute-neutral prompts and (2) attribute classifiers acting on the TTI's internal representations that steer the decoding process of the LLM toward regions of the prompt space that amplify the image attributes of interest. We conduct extensive experiments on Stable Diffusion 1.5 and a state-of-the-art debiased model and discover an array of subtle and previously undocumented biases that severely deteriorate fairness metrics. Crucially, the discovered prompts are interpretable, i.e., they may be entered by a typical user, and they quantitatively improve the perplexity metric compared to a prominent hard prompt optimization counterpart. Our findings uncover TTI vulnerabilities, while BGPS expands the bias search space and can act as a new evaluation tool for bias mitigation.
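The two-component design described above can be illustrated with a minimal sketch of classifier-guided decoding. This is not the authors' implementation: the toy language model, the toy attribute scorer, the placeholder tokens (`job_a`, `job_b`, `job_c`), the probabilities, and the rule of adding a weighted classifier log-score to the LLM log-probability are all assumptions made for illustration. In BGPS the classifiers act on the TTI model's internal representations; here a plain function on tokens stands in for that step.

```python
import math

# Hypothetical stand-in for the LLM: next-token log-probabilities keyed on the last token.
TOY_LM = {
    "of": {"job_a": math.log(0.5), "job_b": math.log(0.3), "job_c": math.log(0.2)},
    "job_a": {"<eos>": 0.0},
    "job_b": {"<eos>": 0.0},
    "job_c": {"<eos>": 0.0},
}

# Hypothetical stand-in for the attribute classifier: made-up probability that
# images generated from a prompt ending in this token show the target attribute.
TOY_ATTR_PROB = {"job_a": 0.5, "job_b": 0.55, "job_c": 0.95}


def toy_lm(prefix):
    """Return a dict mapping each candidate next token to its log-probability."""
    return TOY_LM[prefix[-1]]


def toy_bias(tokens):
    """Return the log of the toy attribute probability for the last token (0.5 if unknown)."""
    return math.log(TOY_ATTR_PROB.get(tokens[-1], 0.5))


def guided_step(lm, bias, prefix, weight, top_k):
    """Pick the next token by combining fluency (LM) and bias (classifier) scores."""
    candidates = lm(prefix)
    # Restrict to the top_k most likely tokens so the prompt stays fluent.
    top = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    scored = {tok: lp + weight * bias(prefix + [tok]) for tok, lp in top}
    return max(scored, key=scored.get)


def guided_decode(lm, bias, prefix, weight=1.0, top_k=3, max_len=5, eos="<eos>"):
    """Greedy decoding in which the classifier score steers the token choice."""
    tokens = list(prefix)
    for _ in range(max_len):
        tok = guided_step(lm, bias, tokens, weight, top_k)
        if tok == eos:
            break
        tokens.append(tok)
    return tokens


unguided = guided_decode(toy_lm, toy_bias, ["a", "photo", "of"], weight=0.0)
guided = guided_decode(toy_lm, toy_bias, ["a", "photo", "of"], weight=2.0)
```

With `weight=0.0` the sketch reduces to ordinary greedy decoding and follows the toy LM's most likely token. With a larger weight, the toy classifier pulls decoding toward a less likely token that it scores higher for the target attribute. The `top_k` restriction is one simple way to keep the guided prompt close to what the language model considers fluent.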

Manos Plitsis, Giorgos Bouritsas, Vassilis Katsouros, Yannis Panagakis • 2025

Related benchmarks

Task                          | Dataset               | Result                   | Rank
Bias discovery                | Female-biased prompts | Female Proportion: 74    | 42
White-biased prompt discovery | White-biased prompts  | White Score: 82          | 18
Biased Prompt Discovery       | Black-biased prompts  | Black Bias Proportion: 7 | 18
Bias Evaluation               | Male-biased prompts   | Male Bias (Base): 0.66   | 14
