

















Abstract: In many reasoning tasks, large language models (LLMs) rely on structured external knowledge, such as graphs and tables, which is typically linearized into sequential token representations. However, even when sufficient knowledge is available, LLMs can still produce hallucinated outputs, and the mechanisms behind such failures remain poorly understood. We investigate these mechanisms and find that hallucinations arise from systematic internal dynamics rather than random noise. First, attention concentrates disproportionately on shortcut-like structural cues rather than distributing across the full context. Second, feed-forward representations fail to ground the provided knowledge, causing the model to revert to parametric memory. Moreover, our results indicate that hallucination is consistently associated with failures in semantic grounding within feed-forward layers, while attention allocation exhibits greater task-dependent variability. Finally, we show that these mechanistic patterns generalize beyond single-hop graphs to multi-hop and tabular settings, enabling effective hallucination detection across structured knowledge formats.
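To make the linearization step mentioned in the abstract concrete, here is a minimal sketch of how a small graph or table might be flattened into text before being passed to an LLM. The serialization formats, the function names `linearize_triples` and `linearize_table`, and the sample data are all illustrative assumptions; the abstract does not specify the format used in the paper.

```python
from typing import Iterable

# A knowledge-graph edge as (head, relation, tail). Hypothetical representation.
Triple = tuple[str, str, str]


def linearize_triples(triples: Iterable[Triple]) -> str:
    """Flatten graph edges into text, one "(head, relation, tail)" line per edge."""
    return "\n".join(f"({h}, {r}, {t})" for h, r, t in triples)


def linearize_table(header: list[str], rows: list[list[str]]) -> str:
    """Flatten a table into pipe-separated lines, header row first."""
    lines = [" | ".join(header)]
    lines += [" | ".join(row) for row in rows]
    return "\n".join(lines)


# Illustrative data only, not taken from the paper.
graph = [
    ("Marie Curie", "born_in", "Warsaw"),
    ("Warsaw", "capital_of", "Poland"),
]
graph_text = linearize_triples(graph)
table_text = linearize_table(["city", "country"], [["Warsaw", "Poland"]])

print(graph_text)
print(table_text)
```

Either string would then be placed in the prompt as context. The two-edge graph above is a multi-hop case in the abstract's sense: answering "In which country was Marie Curie born?" requires chaining both edges.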
| Comments: | To appear in Proceedings of ACL 2026 |
| Subjects: | Computation and Language (cs.CL); Artificial Intelligence (cs.AI) |
| Report number: | ACL 2026 |
| Cite as: | arXiv:2605.26362 [cs.CL] (or arXiv:2605.26362v1 [cs.CL] for this version) |
| | https://doi.org/10.48550/arXiv.2605.26362 (arXiv-issued DOI via DataCite, pending registration) |
From: Shanghao Li
[v1] Mon, 25 May 2026 22:08:59 UTC (763 KB)