





























Abstract: Multilingual fact verification requires evidence that is both relevant and sufficiently complete for reliable factuality prediction. However, existing systems often rely on search snippets, sentence-level evidence, or locally segmented passages, which can miss decisive context and produce fragmented evidence. To overcome these limitations, we propose SEEK (Semantic Evidence Extraction with adaptive chunKing), a framework that constructs coherent evidence chunks from full fact-checking articles by identifying semantic topic transitions and preserving local verification context. The constructed chunks are encoded with a multilingual encoder, and multilingual LLMs are then fine-tuned with LoRA adapters for veracity prediction. Experiments on X-FACT and RU22Fact show that SEEK improves macro-F1 by up to 10% over semantic chunking, 19% over sentence chunking, and 20% over search-snippet baselines. Evidence completeness and significance analyses further show that SEEK preserves richer verification context and enables more reliable multilingual fact-checking.
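As a rough illustration of chunking at semantic topic transitions, the sketch below starts a new chunk whenever the similarity between adjacent sentences drops below a threshold. It is not the SEEK implementation: the bag-of-words `embed` function stands in for a multilingual encoder, the `threshold` value is arbitrary, and the names `embed`, `cosine`, and `adaptive_chunks` are hypothetical. It also does not model how SEEK preserves local verification context.

```python
import math
import re
from collections import Counter


def embed(sentence):
    """Toy bag-of-words vector (assumption: stand-in for a multilingual encoder)."""
    return Counter(re.findall(r"\w+", sentence.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def adaptive_chunks(sentences, threshold=0.2):
    """Group sentences into chunks, splitting where adjacent similarity < threshold.

    The threshold is a hypothetical tuning parameter, not a value from the paper.
    """
    if not sentences:
        return []
    chunks = [[sentences[0]]]
    prev = embed(sentences[0])
    for sentence in sentences[1:]:
        cur = embed(sentence)
        if cosine(prev, cur) < threshold:
            chunks.append([sentence])    # likely topic transition: open a new chunk
        else:
            chunks[-1].append(sentence)  # same topic: extend the current chunk
        prev = cur
    return chunks


article = [
    "The minister claimed the vaccine budget doubled.",
    "Official records show the vaccine budget rose ten percent.",
    "Heavy rain flooded the northern towns.",
    "The rain damaged roads in the towns.",
]
chunks = adaptive_chunks(article)
for chunk in chunks:
    print(chunk)
```

In practice the toy embedding would be replaced by dense sentence vectors, and the split criterion could be made adaptive (for example, relative to the article's own similarity distribution) rather than a fixed cutoff.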
| Subjects: | Computation and Language (cs.CL) |
| Cite as: | arXiv:2605.26755 [cs.CL] |
| | (or arXiv:2605.26755v1 [cs.CL] for this version) |
| | https://doi.org/10.48550/arXiv.2605.26755 (arXiv-issued DOI via DataCite, pending registration) |
From: Gaurav Kumar
[v1] Tue, 26 May 2026 09:27:30 UTC (4,267 KB)