
SEEK: Semantic Evidence Extraction via Adaptive ChunKing for Multilingual Fact-Checking

About

Multilingual fact verification requires evidence that is both relevant and sufficiently complete for reliable factuality prediction. However, existing systems often rely on search snippets, sentence-level evidence, or locally segmented passages, which can miss decisive context and produce fragmented evidence. To overcome these limitations, we propose SEEK (Semantic Evidence Extraction via adaptive chunKing), a framework that constructs coherent evidence chunks from full fact-checking articles by identifying semantic topic transitions and preserving local verification context. The constructed chunks are encoded with a multilingual encoder, and multilingual LLMs are then fine-tuned with LoRA adapters for veracity prediction. Experiments on X-FACT and RU22Fact show that SEEK improves macro-F1 by up to 10% over semantic chunking, 19% over sentence chunking, and 20% over search-snippet baselines. Evidence completeness and significance analyses further show that SEEK preserves richer verification context and enables more reliable multilingual fact-checking.
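
The idea of cutting an article into chunks at semantic topic transitions can be illustrated with a minimal sketch. This is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a multilingual encoder, and the names `chunk_by_topic_shift`, `threshold`, and `overlap` are hypothetical. The `overlap` parameter is only one assumed way to carry local context across a boundary.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy bag-of-words vector (assumption: stands in for a multilingual encoder)."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def chunk_by_topic_shift(sentences, threshold=0.2, overlap=1):
    """Start a new chunk wherever adjacent-sentence similarity falls below
    `threshold`. The last `overlap` sentences of the closed chunk are copied
    into the next one (hypothetical way to keep local context)."""
    if not sentences:
        return []
    vectors = [embed(s) for s in sentences]
    chunks, current = [], [sentences[0]]
    for i in range(1, len(sentences)):
        if cosine(vectors[i - 1], vectors[i]) < threshold:
            chunks.append(current)
            current = current[-overlap:] if overlap > 0 else []
        current.append(sentences[i])
    chunks.append(current)
    return chunks


article = [
    "The minister said unemployment fell last year.",
    "Official data show unemployment fell by two percent last year.",
    "The stadium opens in June.",
    "Tickets for the stadium go on sale in May.",
]
for chunk in chunk_by_topic_shift(article, threshold=0.2, overlap=0):
    print(chunk)
```

In this toy article the similarity between the second and third sentences is zero, so the sketch splits there. A real system would swap `embed` for dense multilingual sentence embeddings and would likely tune the boundary rule rather than use a fixed threshold.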

Babu Kumar, Gaurav Kumar, Ayush Garg, Aditya Kishore, Jasabanta Patro • 2026

Related benchmarks

Task                  Dataset                       Result        Rank
Veracity Prediction   Ru22fact (test)               MF1 90        25
Fact Verification     X-Fact In-Domain (ID)         Macro-F1 67   15
Fact Verification     X-Fact Out-of-Domain (OOD)    Macro-F1 41   15
Fact Verification     X-Fact Zero-Shot (ZS)         Macro-F1 30   15
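
The results above are reported as macro-F1, which averages per-class F1 scores with equal weight regardless of class frequency. The following is a minimal sketch of that computation. The function name `macro_f1` and the example labels are illustrative assumptions, not taken from the benchmark datasets.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    if not labels:
        return 0.0
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative labels only (hypothetical, not benchmark data)
gold = ["true", "true", "false", "false", "mixed"]
pred = ["true", "false", "false", "false", "true"]
print(round(macro_f1(gold, pred), 4))
```

Because every class counts equally, a rare class that is never predicted correctly (here "mixed") pulls the score down sharply, which is why macro-F1 is a common choice for imbalanced veracity labels.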
