
























Abstract: Large language models often reason beyond surface tokens, but the internal stage at which token-level information becomes abstract relational structure remains unclear. We investigate this question by analyzing how attention heads and layers transform information during autoregressive reasoning. Across mathematical and symbolic reasoning tasks, we observe a consistent layer-wise division of labor: outer layers mainly preserve and route input-related features, whereas middle layers reorganize them into more transferable rule-level representations. This interpretation is supported by representation geometry: middle-layer states occupy lower-dimensional manifolds and show stronger alignment across disjoint vocabularies that instantiate the same symbolic rules. It is further supported by causal interventions: removing middle-layer components identified by our interaction-based criterion produces substantially larger downstream changes and accuracy drops than removing components from other regions or at random. Together, these results suggest that abstract reasoning is not uniformly distributed across transformer layers, but is preferentially formed in a middle-layer computation stage that converts token-level information into reusable relational structure.
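The abstract's claim that middle-layer states "occupy lower-dimensional manifolds" can be made concrete with an effective-dimensionality estimate. The abstract does not say which measure the authors use; the sketch below assumes the participation ratio, PR = tr(C)^2 / tr(C^2) for the covariance C of a layer's hidden states, purely as an illustration. The function name and the toy data are hypothetical and are not taken from the paper.

```python
import random


def participation_ratio(states):
    """Effective dimensionality PR = tr(C)^2 / tr(C^2) of the sample covariance C.

    `states` is a list of equal-length vectors (one hidden state per row).
    This is an assumed, illustrative measure, not the paper's stated metric.
    """
    n = len(states)
    d = len(states[0])
    # Center each feature column.
    means = [sum(row[j] for row in states) / n for j in range(d)]
    centered = [[row[j] - means[j] for j in range(d)] for row in states]
    # Sample covariance matrix (d x d).
    cov = [
        [sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]
    trace = sum(cov[i][i] for i in range(d))
    # C is symmetric, so tr(C^2) equals the sum of squared entries.
    trace_of_square = sum(cov[i][j] ** 2 for i in range(d) for j in range(d))
    return trace ** 2 / trace_of_square


def toy_states(n, d, rank_one, seed=0):
    """Hypothetical stand-in for layer activations: isotropic noise or a 1-D line."""
    rng = random.Random(seed)
    if rank_one:
        direction = [rng.gauss(0.0, 1.0) for _ in range(d)]
        return [[t * v for v in direction] for t in (rng.gauss(0.0, 1.0) for _ in range(n))]
    return [[rng.gauss(0.0, 1.0) for _ in range(d)] for _ in range(n)]


if __name__ == "__main__":
    print("isotropic:", participation_ratio(toy_states(500, 8, rank_one=False)))
    print("rank one: ", participation_ratio(toy_states(500, 8, rank_one=True)))
```

Under this assumed measure, states spread evenly over all directions score near the ambient dimension, while states confined to a single direction score near 1. Comparing the value layer by layer is one way a "lower-dimensional middle layer" pattern could be checked.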
| Subjects: | Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2603.29735 [cs.AI] (or arXiv:2603.29735v2 [cs.AI] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2603.29735 (arXiv-issued DOI via DataCite) |
From: JunJie Zhang
[v1] Tue, 31 Mar 2026 13:36:08 UTC (3,263 KB)
[v2] Thu, 21 May 2026 04:48:54 UTC (2,648 KB)