SocraticKG: Knowledge Graph Construction via QA-Driven Fact Extraction

About

Constructing Knowledge Graphs (KGs) from unstructured text provides a structured framework for knowledge representation and reasoning, yet current LLM-based approaches struggle with a fundamental trade-off: factual coverage often leads to relational fragmentation, while premature consolidation causes information loss. To address this, we propose SocraticKG, an automated KG construction method that introduces question-answer pairs as a structured intermediate representation to systematically unfold document-level semantics prior to triple extraction. By employing 5W1H-guided QA expansion, SocraticKG captures contextual dependencies and implicit relational links typically lost in direct KG extraction pipelines, providing explicit grounding in the source document that helps mitigate implicit reasoning errors. Evaluation on the MINE benchmark demonstrates that our approach effectively addresses the coverage-connectivity trade-off, achieving superior factual retention while maintaining high structural cohesion even as extracted knowledge volume substantially expands. These results highlight that QA-mediated semantic scaffolding plays a critical role in structuring semantics prior to KG extraction, enabling more coherent and reliable graph construction in subsequent stages.
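
The two-stage flow described above (5W1H-guided QA expansion, then triple extraction from the QA pairs) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `QAPair`, `expand_to_qa`, `extract_triples`, `toy_ask`, and `toy_to_triples` are hypothetical, and the two callables stand in for LLM calls that the abstract does not specify. The toy stubs return canned data only so that the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# The six question types covered by "5W1H".
FIVE_W_ONE_H = ("who", "what", "when", "where", "why", "how")

Triple = Tuple[str, str, str]  # (subject, relation, object)


@dataclass(frozen=True)
class QAPair:
    question: str
    answer: str


def expand_to_qa(
    document: str,
    ask: Callable[[str, str], List[QAPair]],
) -> List[QAPair]:
    """Collect QA pairs for every 5W1H question type.

    `ask(document, question_type)` is a hypothetical stand-in for an LLM call.
    """
    pairs: List[QAPair] = []
    for question_type in FIVE_W_ONE_H:
        pairs.extend(ask(document, question_type))
    return pairs


def extract_triples(
    pairs: List[QAPair],
    to_triples: Callable[[QAPair], List[Triple]],
) -> List[Triple]:
    """Extract triples from each QA pair, dropping exact duplicates in order."""
    seen = set()
    triples: List[Triple] = []
    for pair in pairs:
        for triple in to_triples(pair):
            if triple not in seen:
                seen.add(triple)
                triples.append(triple)
    return triples


# Toy stand-ins (assumptions for illustration only; a real system would call a model).
def toy_ask(document: str, question_type: str) -> List[QAPair]:
    canned = {
        "who": [QAPair("Who discovered radium?", "Marie Curie")],
        "when": [QAPair("When was radium discovered?", "1898")],
    }
    return canned.get(question_type, [])


def toy_to_triples(pair: QAPair) -> List[Triple]:
    table = {
        "Who discovered radium?": [("Marie Curie", "discovered", "radium")],
        "When was radium discovered?": [("radium", "discovered_in", "1898")],
    }
    return table.get(pair.question, [])


doc = "Marie Curie discovered radium in 1898."
pairs = expand_to_qa(doc, toy_ask)
triples = extract_triples(pairs, toy_to_triples)
```

Keeping the QA pairs as an explicit intermediate list means each extracted triple can be traced back to the question and answer that produced it, which is one plausible way to realize the grounding the abstract refers to.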

Sanghyeok Choi, Woosang Jeon, Kyuseok Yang, Taehyeong Kim • 2026

Related benchmarks

Task | Dataset | Result | Rank
Factual Retention | MINE | Factual Retention (%): 96.3 | 25
Knowledge Graph Construction | MINE benchmark 100 articles | Mean Node Count: 104.2 | 25
Knowledge Graph Extraction | MINE benchmark 1.0 (100 articles) (test) | NFI: 0.106 | 25