Prompting Techniques for Reducing Social Bias in LLMs through System 1 and System 2 Cognitive Processes

About

Dual process theory posits that human cognition arises via two systems. System 1 is a quick, emotional, and intuitive process that is subject to cognitive biases, while System 2 is a slow, onerous, and deliberate process. Prior research found that chain-of-thought (CoT) prompting in LLMs, which has often been compared to System 2 reasoning, can reduce gender bias. Along these lines, we investigate the relationship between bias, CoT prompting, direct debiasing, and dual process theory modeling in LLMs. We compare zero-shot CoT, debiasing, and dual process theory-based prompting strategies on two bias datasets spanning nine different social bias categories. We incorporate human and machine personas to determine whether LLM modeling of the effects of dual process theory exists independently of explicit persona models or is tied to the LLM's modeling of human-like generation. We find that a human persona, debiasing, System 2, and CoT prompting all tend to reduce social biases in LLMs, yielding up to a 33 percent drop in stereotypical judgments by an LLM, though the best combination of features depends on the exact model and bias category.
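The prompting conditions described above (persona crossed with a reasoning or debiasing instruction) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the instruction wording, the dictionary keys, and the helper names `build_prompt` and `all_variants` are hypothetical and are not the prompts or code used in the paper.

```python
from itertools import product

# Hypothetical persona prefixes; the paper's actual wording is not given here.
PERSONAS = {
    "none": "",
    "human": "You are a person.",
    "machine": "You are a machine.",
}

# Hypothetical instructions for each prompting strategy named in the abstract.
STRATEGIES = {
    "standard": "",
    "cot": "Let's think step by step.",
    "debias": "Make sure your answer does not rely on stereotypes.",
    "system1": "Answer quickly, using your first intuition.",
    "system2": "Answer slowly and deliberately, after careful thought.",
}


def build_prompt(question, persona="none", strategy="standard"):
    """Join the persona, the strategy instruction, and the question,
    skipping any part that is empty."""
    parts = [PERSONAS[persona], STRATEGIES[strategy], question]
    return "\n".join(p for p in parts if p)


def all_variants(question):
    """Return one prompt per (persona, strategy) pair."""
    return {
        (p, s): build_prompt(question, p, s)
        for p, s in product(PERSONAS, STRATEGIES)
    }
```

Under these assumptions, `all_variants(question)` gives one prompt per condition, so the same bias-benchmark item can be sent to a model under every persona and strategy pairing and the rates of stereotypical answers compared across conditions.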

Mahammed Kamruzzaman, Gene Louis Kim • 2024

Related benchmarks

Task	Dataset	Result
In-Context Value Alignment	Value Composition (Overall)	Confucianism Score 3.331	37
Evaluation-based Bias Reduction	Bias Reduction Benchmark (Evaluation)	Bias Reduction Performance 99.8	35
Memory-based Bias Reduction	Bias Reduction Benchmark Memory	Bias Reduction Performance 50.4	35
Memory Fidelity Evaluation	Memory-based Experiment Seen Features	P-Diff 0.03	32
In-hospital mortality prediction	MIMIC-IV (test)	AUROC 78.5	27
Value Alignment	Confucianism-4	Conformity Score 3.46	22
Value Alignment	HH Balance-8	Conformity Score 3.939	17
Value Alignment	Helpfulness 4	Conformity Score 4.335	16
Value Alignment	Harmlessness 4	Conformity Score 3.907	16
Value Alignment	Liberalism 4	Conformity Score 2.732	11

