Multi-Persona Thinking for Bias Mitigation in Large Language Models
About
Large Language Models (LLMs) exhibit social biases, which can lead to harmful stereotypes and unfair outcomes. We propose Multi-Persona Thinking (MPT), a simple inference-time framework that reduces social bias by encouraging reasoning from multiple perspectives. MPT guides the model to consider contrasting social identities, such as male and female, together with a neutral viewpoint. These viewpoints then interact through an iterative reasoning process to identify and correct biased judgments. This design transforms the potential weakness of persona assignment into a mechanism for bias mitigation. We evaluate MPT on two widely used bias benchmarks with both open-source and closed-source models across different scales. Results show that MPT achieves lower bias than existing prompting-based methods while maintaining core reasoning ability.
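The loop described in the abstract (contrasting personas plus a neutral one, revised over several rounds) could be sketched roughly as follows. This is an illustrative sketch only, not the authors' implementation: the function name `multi_persona_answer`, the `ask` callable (a stand-in for any prompt-to-text LLM call), the prompt wording, the number of rounds, and the majority-vote aggregation are all assumptions introduced here.

```python
from collections import Counter
from typing import Callable, Sequence


def multi_persona_answer(
    question: str,
    ask: Callable[[str], str],  # hypothetical LLM call: prompt in, answer text out
    personas: Sequence[str] = ("male", "female", "neutral"),
    rounds: int = 2,
) -> str:
    """Answer a question by letting several personas revise each other's answers."""
    # Initial pass: each persona answers independently.
    answers = {
        p: ask(f"Answer from the viewpoint of a {p} persona.\nQuestion: {question}")
        for p in personas
    }
    # Iterative passes: each persona sees the other answers and may revise its own.
    for _ in range(rounds):
        revised = {}
        for p in personas:
            others = "\n".join(
                f"- {q}: {a}" for q, a in answers.items() if q != p
            )
            revised[p] = ask(
                f"You are reasoning as a {p} persona.\nQuestion: {question}\n"
                f"Other viewpoints answered:\n{others}\n"
                "Point out any stereotype-driven judgment, then give your final answer."
            )
        answers = revised
    # Assumed aggregation: majority vote over the final answers
    # (ties resolve to the answer that appears first in persona order).
    return Counter(answers.values()).most_common(1)[0][0]
```

With `rounds=0` the sketch reduces to plain persona prompting with a vote, which makes it easy to compare against the iterative variant on the same questions.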
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Question Answering | BBQ | -- | 36 |
| Question Answering | BBQ (test) | Accuracy (amb): 98.46 | 20 |