DebiasRAG: A Tuning-Free Path to Fair Generation in Large Language Models through Retrieval-Augmented Generation

About

Large language models (LLMs) have achieved unprecedented success due to their exceptional generative capabilities. However, because they depend on knowledge absorbed from training corpora, they may produce hallucinations, stereotypes, and socially biased content. In particular, LLMs are prone to prejudiced responses involving race, gender, and age, which are collectively referred to as social biases. Prior studies have used fine-tuning and prompt engineering to mitigate such biases in LLMs, but these methods require additional training resources or domain knowledge to design the framework. Moreover, they may degrade the original capabilities of LLMs and often overlook the need for dynamic debiasing contexts for fairer inference. In this paper, we propose DebiasRAG, a novel tuning-free, dynamic, query-specific debiasing framework based on retrieval-augmented generation (RAG). DebiasRAG improves fairness while preserving the intrinsic properties of LLMs, such as representation ability. DebiasRAG consists of three stages: (1) query-specific debiasing candidate generation; (2) context candidate pool construction; and (3) gradient-updated debiasing-guided context piece reranking. First, DebiasRAG retrieves self-diagnosed bias contexts relevant to the query through regular retrieval, where the bias contexts are prepared offline by the DebiasRAG provider. Given the query-specific bias contexts, DebiasRAG reverses them to produce debiasing contexts, which are provided as additional fairness constraints on LLM outputs. Second, a regular RAG retrieval process produces query-related contexts from the regular RAG document database, such as a chunked Wikipedia dataset.
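The first two stages described above can be illustrated with a minimal Python sketch. This is not the authors' implementation. The bag-of-words `embed` function, the `retrieve` helper, the `to_debias` callable (standing in for whatever step turns a bias context into a debiasing context), and the choice to simply concatenate the two context lists into one pool are all assumptions made for illustration. The third stage (gradient-updated reranking) is omitted because the text above does not describe it.

```python
import math
from collections import Counter
from typing import Callable, List


def embed(text: str) -> Counter:
    """Toy bag-of-words embedding (assumption; a real system would likely use a dense encoder)."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, corpus: List[str], k: int) -> List[str]:
    """Return the k corpus entries most similar to the query."""
    q = embed(query)
    return sorted(corpus, key=lambda doc: cosine(q, embed(doc)), reverse=True)[:k]


def build_candidate_pool(
    query: str,
    bias_store: List[str],            # bias contexts prepared offline (hypothetical store)
    doc_store: List[str],             # regular RAG document chunks
    to_debias: Callable[[str], str],  # hypothetical stand-in for the reversal step
    k: int = 2,
) -> List[str]:
    # Stage 1: retrieve query-relevant bias contexts, then reverse each into a debiasing context.
    bias_contexts = retrieve(query, bias_store, k)
    debias_contexts = [to_debias(c) for c in bias_contexts]
    # Stage 2: regular retrieval of query-related contexts from the document store.
    regular_contexts = retrieve(query, doc_store, k)
    # Assumption: the candidate pool is the concatenation of both lists.
    return debias_contexts + regular_contexts
```

Passing the reversal step in as a callable keeps the sketch independent of any particular model; in practice that step would presumably be handled by an LLM or a prepared mapping, and the resulting pool would then be handed to the reranking stage.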

Rui Chu, Bingyin Zhao, Thanh Quoc Hung Le, Duy Cao Hoang, Huawei Lin, Ping Li, Weijie Zhao, Khoa D Doan, Yingjie Lao • 2026

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Stereotype Bias Evaluation | StereoSet Gender | LMS Score 92.98 | 24 |
| Gender bias evaluation | SEAT | SEAT 60.42 | 16 |
| Stereotype Bias Evaluation | StereoSet Overall | LMS 91.05 | 8 |
| Bias Evaluation | CrowS-Pairs | CP-S Score 41.38 | 6 |
| Bias Evaluation | Regard | Regard Score 33.9 | 4 |
| Bias Evaluation | HBR | HBR Score 96.7 | 4 |
| Bias Evaluation | TG2 | TG2 Score 21.2 | 4 |
| Bias Evaluation | aPS | APS 20 | 4 |
| Bias Evaluation | BoLD | -- | 4 |