Follow the Path: Reasoning over Knowledge Graph Paths to Improve Large Language Model Factuality

About

We introduce fs1, a simple yet effective method that improves the factuality of reasoning traces by collecting them from large reasoning models and grounding them in knowledge graph (KG) paths. We fine-tune eight instruction-tuned Large Language Models (LLMs) on 3.9K factually grounded reasoning traces and rigorously evaluate them on six complex open-domain question-answering (QA) benchmarks encompassing 23.9K questions. Our results demonstrate that fs1-tuned models consistently outperform their instruction-tuned counterparts with parallel sampling by 6-14 absolute points (pass@16). Our detailed analysis shows that fs1 considerably improves model performance on more complex questions (requiring 3 or more hops on KG paths) and numerical answer types compared to the baselines. Furthermore, in single-pass inference, we notice that smaller LLMs show the largest improvements. While prior works demonstrate the effectiveness of reasoning traces primarily in the STEM domains, our work shows strong evidence that anchoring reasoning to factual KG paths is a critical step in transforming LLMs for reliable knowledge-intensive tasks.
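The abstract reports gains under parallel sampling as pass@16 but does not state how the metric is computed. As a minimal sketch, assuming the widely used unbiased pass@k estimator (the probability that at least one of k samples drawn from n generated answers is correct), the calculation could look like the following. The function names `pass_at_k` and `mean_pass_at_k` are hypothetical and are not taken from the paper.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one question.

    n: number of sampled answers, c: how many of them are correct,
    k: sampling budget being evaluated (k <= n).
    Assumption: the standard unbiased estimator 1 - C(n-c, k) / C(n, k);
    the paper may compute the metric differently.
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: every draw of k contains a hit.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def mean_pass_at_k(results: list[list[bool]], k: int) -> float:
    """Average pass@k over questions.

    results[i] holds one correctness flag per sampled answer to question i.
    """
    return sum(pass_at_k(len(r), sum(r), k) for r in results) / len(results)
```

With n = k = 16 the estimator reduces to checking whether any of the 16 samples is correct, which is the usual reading of pass@16 under parallel sampling.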

Mike Zhang, Johannes Bjerva, Russa Biswas • 2025

Related benchmarks

Task	Dataset	Metric	Result	
Multi-hop Knowledge Graph Question Answering	GrailQA	Hits@1	34.4	68
Multi-hop Question Answering	CWQ	Pass@1	47.7	36
Multi-hop Question Answering	Mintaka	Pass@1	68.2	36
Multi-hop Question Answering	ExaQT	Pass@1	36.1	36
Multi-hop Question Answering	WebQSP	Pass@1	57.6	36
Multi-hop Question Answering	SimpleQA	Pass@1	7.9	36
