FactCorrector: A Graph-Inspired Approach to Long-Form Factuality Correction of Large Language Models
About
Large language models (LLMs) are widely used in knowledge-intensive applications but often generate factually incorrect responses. A promising way to remedy these flaws is to correct LLM outputs using feedback. In this paper, we introduce FactCorrector, a new post-hoc correction method that adapts across domains without retraining and leverages structured feedback about the factuality of the original response to generate a correction. To support rigorous evaluation of factuality-correction methods, we also develop VELI5, a benchmark dataset containing systematically injected factual errors and ground-truth corrections. Experiments on VELI5 and several popular long-form factuality datasets show that FactCorrector significantly improves factual precision while preserving relevance, outperforming strong baselines. We release our code at https://ibm.biz/factcorrector.
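The abstract above describes a post-hoc loop: decompose a response into atomic claims, collect structured feedback on each claim's factuality, and regenerate a corrected response from that feedback. The sketch below illustrates that control flow only; all function names are hypothetical, and the rule-based verifier is a stand-in for the LLM-based, graph-inspired feedback used by the actual FactCorrector system.

```python
# Minimal sketch of a post-hoc factuality-correction loop.
# Assumptions: `decompose`, `verify`, and `correct` are illustrative
# placeholders, and the toy knowledge base replaces the real verifier.

from dataclasses import dataclass


@dataclass
class ClaimFeedback:
    """Structured feedback for one atomic claim."""
    claim: str
    supported: bool
    reason: str


def decompose(response: str) -> list[str]:
    # Placeholder: treat each sentence as one atomic claim.
    return [s.strip() for s in response.split(".") if s.strip()]


def verify(claim: str, kb: set[str]) -> ClaimFeedback:
    # Placeholder verifier: a claim is "supported" if it appears in the
    # toy knowledge base. The real system produces richer feedback.
    supported = claim in kb
    return ClaimFeedback(claim, supported,
                         "found in KB" if supported else "not found in KB")


def correct(response: str, kb: set[str]) -> str:
    # Post-hoc correction: keep supported claims, drop unsupported ones.
    # A real corrector would instead rewrite unsupported claims using
    # the structured feedback rather than simply deleting them.
    feedback = [verify(c, kb) for c in decompose(response)]
    kept = [f.claim for f in feedback if f.supported]
    return ". ".join(kept) + "." if kept else ""


if __name__ == "__main__":
    kb = {"Paris is the capital of France"}
    draft = "Paris is the capital of France. Paris is in Spain."
    print(correct(draft, kb))  # → Paris is the capital of France.
```

The key design point mirrored here is that correction operates on structured per-claim feedback (`ClaimFeedback`) rather than on a single scalar score, which is what lets the method stay domain-agnostic without retraining.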
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Factuality Correction | VELI5 | Mean Factual Precision | 0.97 | 64 |
| Factuality Correction | Bio (test) | Precision | 46 | 44 |
| Factuality Correction | ASKHIST | Mean Factual Precision | 0.92 | 40 |
| Factuality Correction | CONFLICTS | ROUGE | 97 | 25 |
| Factuality Correction | VELI5 1.0 (test) | Precision (Pr) | 36 | 24 |
| Factuality Correction | BIO dataset | Factual Precision | 91 | 24 |
| Post-hoc Correction | ConflictBank (100 atomic claims) | ROUGE | 89 | 15 |