Fact-Checking Complex Claims with Program-Guided Reasoning
About
Fact-checking real-world claims often requires collecting multiple pieces of evidence and applying complex multi-step reasoning. In this paper, we present Program-Guided Fact-Checking (ProgramFC), a novel fact-checking model that decomposes complex claims into simpler sub-tasks that can be solved using a shared library of specialized functions. We first leverage the in-context learning ability of large language models to generate reasoning programs that guide the verification process. We then execute each program by delegating every sub-task to the corresponding sub-task handler. This process makes our model both explanatory and data-efficient: it provides clear explanations of its reasoning process and requires minimal training data. We evaluate ProgramFC on two challenging fact-checking datasets and show that it outperforms seven fact-checking baselines across different settings of evidence availability, with explicit output programs that benefit human debugging. Our code and data are publicly available at https://github.com/mbzuai-nlp/ProgramFC.
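The generate-then-execute idea described above can be illustrated with a minimal Python sketch. It is not the ProgramFC implementation: the `Step` structure, the handler names `question` and `verify`, the toy fact table, and the `SUPPORTS`/`REFUTES` labels are all assumptions made for illustration. The program here is also written by hand, whereas the abstract describes it as generated by a large language model.

```python
from typing import Callable, Dict, List, NamedTuple


class Step(NamedTuple):
    handler: str   # name of the sub-task handler to call (hypothetical names)
    argument: str  # text argument; may reference earlier results as {var}
    output: str    # variable name that stores this step's result


# Toy evidence store standing in for a retriever or a language model.
TOY_FACTS = {
    "Who directed the film Example Movie?": "Jane Doe",
    "Jane Doe was born in Canada.": True,
}


def answer_question(text: str) -> str:
    """Stub question-answering handler backed by the toy fact table."""
    return str(TOY_FACTS.get(text, "unknown"))


def verify_statement(text: str) -> bool:
    """Stub verification handler backed by the toy fact table."""
    return TOY_FACTS.get(text) is True


# Shared library of specialized functions, keyed by handler name.
HANDLERS: Dict[str, Callable[[str], object]] = {
    "question": answer_question,
    "verify": verify_statement,
}


def execute(program: List[Step]) -> Dict[str, object]:
    """Run each step in order, substituting earlier results into later arguments."""
    memory: Dict[str, object] = {}
    for step in program:
        argument = step.argument.format(**memory)
        memory[step.output] = HANDLERS[step.handler](argument)
    return memory


# Hand-written reasoning program for a made-up two-hop claim:
# "The director of Example Movie was born in Canada."
program = [
    Step("question", "Who directed the film Example Movie?", "director"),
    Step("verify", "{director} was born in Canada.", "fact_1"),
]

trace = execute(program)
label = "SUPPORTS" if trace["fact_1"] else "REFUTES"
print(trace)
print(label)
```

Because every intermediate result is stored under a named variable, the returned `trace` doubles as a step-by-step explanation of the verdict, which is the property the abstract highlights for human debugging.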
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Fact Checking | FEVEROUS (test) | Macro F1 68.06 | 20 |
| Fact Checking | HOVER 3-hop (test) | Macro F1 63.43 | 16 |
| Fact Checking | HOVER 2-hop (test) | Macro F1 70.3 | 16 |
| Fact Checking | HOVER 4-hop (test) | Macro F1 59.16 | 16 |
| Scientific Fact Verification | SciFact | Macro F1 0.7182 | 16 |
| Fact Checking | HOVER | Macro F1 (2-hop) 69.78 | 12 |
| Fact Checking | FEVEROUS-S | Macro F1 65.59 | 12 |
| Multi-hop Fact Verification | HOVER 2-hop | Macro F1 71 | 7 |
| Multi-hop Fact Verification | HOVER 3-hop | Macro F1 51 | 7 |
| Multi-hop Fact Verification | HOVER 4-hop | Macro F1 53 | 7 |