Abstract: Evaluating the quality of reasoning traces from large language models remains understudied, labor-intensive, and unreliable: current practice relies on expert rubrics, manual annotation, and slow pairwise judgments. Automated efforts are dominated by graph-based proxies that quantify structural connectivity but do not clarify what constitutes high-quality reasoning; such abstractions can be overly simplistic for inherently complex processes. We introduce a topological data analysis (TDA)-based evaluation framework that captures the geometry of reasoning traces and enables label-efficient, automated assessment. In our empirical study, topological features yield substantially higher predictive power for assessing reasoning quality than standard graph metrics, suggesting that effective reasoning is better captured by higher-dimensional geometric structures than by purely relational graphs. We further show that a compact, stable set of topological features reliably indicates trace quality, offering a practical signal for future reinforcement learning algorithms.
| Comments: | Accepted in ICML 2026 Workshop: Epistemic Intelligence in Machine Learning |
| Subjects: | Artificial Intelligence (cs.AI) |
| Cite as: | arXiv:2510.20665 [cs.AI] (or arXiv:2510.20665v2 [cs.AI] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2510.20665 (arXiv-issued DOI via DataCite) |
From: Xue Wen Tan Dr
[v1] Thu, 23 Oct 2025 15:43:43 UTC (15,472 KB)
[v2] Thu, 21 May 2026 14:18:00 UTC (15,387 KB)