Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models

About

Large vision-language model (LVLM) based text-to-image (T2I) systems have become the dominant paradigm in image generation, yet whether they amplify social biases remains insufficiently understood. In this paper, we show that LVLM-based models produce markedly more socially biased images than non-LVLM-based models. We introduce a 1,024-prompt benchmark spanning four levels of linguistic complexity and systematically evaluate demographic bias across multiple attributes. Our analysis identifies system prompts, the predefined instructions guiding LVLMs, as a primary driver of biased behavior. Through decoded intermediate representations, token-probability diagnostics, and embedding-association analyses, we reveal how system prompts encode demographic priors that propagate into image synthesis. To address this, we propose FairPro, a training-free meta-prompting framework that enables LVLMs to self-audit and construct fairness-aware system prompts at test time. Experiments on two LVLM-based T2I models, SANA and Qwen-Image, show that FairPro substantially reduces demographic bias while preserving text-image alignment. We believe our findings provide deeper insight into the central role of system prompts in bias propagation and offer a practical, deployable approach for building more socially responsible T2I systems.
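The self-audit-then-rewrite idea behind a meta-prompting approach like FairPro can be sketched roughly as follows. This is only an illustration under stated assumptions, not the authors' implementation: the two template strings, the function name `build_fair_system_prompt`, and the `lvlm` callable (any function that maps a text message to a text reply) are all hypothetical, and the abstract does not give the actual prompts used in the paper.

```python
from typing import Callable

# Hypothetical meta-prompt wording (assumption): the real FairPro prompts
# are not given in the abstract.
AUDIT_TEMPLATE = (
    "You are about to condition an image generator.\n"
    "Default system prompt: {system_prompt}\n"
    "User prompt: {user_prompt}\n"
    "List any demographic assumptions (for example gender, age, ethnicity) "
    "that the default system prompt could introduce but the user did not request."
)

REWRITE_TEMPLATE = (
    "Audit notes: {audit}\n"
    "Write a new system prompt for the same task that avoids these assumptions "
    "while staying faithful to the user prompt: {user_prompt}"
)


def build_fair_system_prompt(
    lvlm: Callable[[str], str],
    default_system_prompt: str,
    user_prompt: str,
) -> str:
    """Two-step, training-free loop: audit the default system prompt, then rewrite it.

    `lvlm` is a stand-in for whatever chat interface the T2I model's
    language backbone exposes (assumption).
    """
    # Step 1: ask the model to audit the default system prompt for implicit priors.
    audit = lvlm(
        AUDIT_TEMPLATE.format(
            system_prompt=default_system_prompt,
            user_prompt=user_prompt,
        )
    )
    # Step 2: ask the model to produce a fairness-aware replacement system prompt.
    return lvlm(REWRITE_TEMPLATE.format(audit=audit, user_prompt=user_prompt))


def stub_lvlm(message: str) -> str:
    """Placeholder model so the sketch runs without any real LVLM."""
    return f"(model reply to a {len(message)}-character message)"


if __name__ == "__main__":
    print(build_fair_system_prompt(stub_lvlm, "Describe the image in detail.", "a photo of a doctor"))
```

In this sketch the returned string would replace the default system prompt at generation time, so no model weights are changed; how the paper actually feeds the rewritten prompt into SANA or Qwen-Image is not specified here.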

NaHyeon Park, Namin An, Kunhee Kim, Soyeon Yoon, Jiahao Huo, Hyunjung Shim • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Image Generation Diversity | Prompt Complexity (Rewritten) | CLIP Similarity | 0.9235 | 6 |
| Image Generation Diversity | Prompt Complexity (Simple) | CLIP Score | 88.39 | 6 |
| Image Generation Diversity | Prompt Complexity (Context) | CLIP Similarity | 90.38 | 6 |
| Image Generation Diversity | Prompt Complexity Mean | CLIP Similarity | 0.8919 | 6 |
| Image Generation Diversity | Prompt Complexity (Occupation) | CLIP Score | 0.8563 | 6 |
| Bias Measurement | Full Prompt Benchmark Aggregate (test) | Gender Bias Score | 81.6 | 6 |
| Social Bias Evaluation | TIBET | Gender | 87 | 6 |
| Text-to-Image Generation | Occupation prompts | Bias | 0.746 | 4 |
| Text-to-Image Generation | Simple prompts | Bias | 0.797 | 4 |
| Text-to-Image Generation | Context prompts | Bias | 0.815 | 4 |
Showing 10 of 11 rows
