RAC: Efficient LLM Factuality Correction with Retrieval Augmentation
About
Large Language Models (LLMs) exhibit impressive results across a wide range of natural language processing (NLP) tasks, yet they often produce factually incorrect outputs. This paper introduces a simple but effective low-latency post-correction method, **Retrieval Augmented Correction (RAC)**, aimed at enhancing the factual performance of LLMs without requiring additional fine-tuning. Our method is general, can be used with any instruction-tuned LLM, and has greatly reduced latency compared to prior approaches. RAC decomposes the LLM's output into atomic facts and applies a fine-grained verification and correction process using retrieved content to verify and correct the LLM-generated output. Our extensive experiments show that RAC yields up to 30% improvements over state-of-the-art baselines on two popular factuality evaluation datasets, validating its efficacy and robustness across different LLMs, both with and without the integration of Retrieval-Augmented Generation (RAG). Our code is available at https://github.com/jlab-nlp/Retrieval-Augmented-Correction
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Factuality Correction | VELI5 | Mean Factual Precision | 0.91 | 64 |
| Factuality Correction | Bio (test) | Precision | 51 | 44 |
| Factuality Correction | ASKHIST | Mean Factual Precision | 0.92 | 40 |
| Factual Correction | CONFLICTS | ROUGE | 87 | 25 |
| Factuality Correction | BIO dataset | Factual Precision | 91 | 24 |
| Factuality Correction | VELI5 1.0 (test) | Precision (Pr) | 34 | 24 |
| Post-hoc Correction | ConflictBank (100 atomic claims) | ROUGE | 87 | 15 |