





















Abstract: Evaluating true metacognition in Large Language Models (LLMs) is difficult due to biases and heuristics. This paper presents a framework to measure and enhance LLM metacognition while controlling for these biases. A measurement method using the $d'_{\rm type2}$ metric is established to isolate metacognitive ability. The Evolution Strategy for Metacognitive Alignment (ESMA) is proposed, demonstrating robust generalization across unseen datasets, languages, and newly acquired knowledge. Finally, parameter analysis reveals that these improvements are driven by a sparse set of parameters, offering new pathways for targeted metacognitive optimization.
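The abstract does not spell out how $d'_{\rm type2}$ is computed, so the following is only a minimal sketch of the standard signal-detection form of a type-2 sensitivity index: the z-transformed rate of confident responses on correct answers minus the z-transformed rate of confident responses on incorrect answers. The function name `type2_dprime`, the binary confidence encoding, and the log-linear correction are assumptions made for illustration and may differ from the paper's procedure.

```python
from statistics import NormalDist


def type2_dprime(correct, confident):
    """Type-2 d' sketch: z(type-2 hit rate) - z(type-2 false-alarm rate).

    correct   -- booleans, True when the model's answer was right
    confident -- booleans, True when the model reported high confidence
    (Hypothetical helper; the binary encoding is an assumption.)
    """
    if len(correct) != len(confident):
        raise ValueError("inputs must have the same length")

    n_correct = sum(1 for c in correct if c)
    n_incorrect = len(correct) - n_correct
    if n_correct == 0 or n_incorrect == 0:
        raise ValueError("need at least one correct and one incorrect answer")

    # Type-2 hit: confident and correct. Type-2 false alarm: confident but wrong.
    hits = sum(1 for c, k in zip(correct, confident) if c and k)
    false_alarms = sum(1 for c, k in zip(correct, confident) if not c and k)

    # Log-linear correction keeps both rates strictly inside (0, 1),
    # so the inverse normal CDF stays finite (an assumed choice).
    hit_rate = (hits + 0.5) / (n_correct + 1)
    fa_rate = (false_alarms + 0.5) / (n_incorrect + 1)

    z = NormalDist().inv_cdf
    return z(hit_rate) - z(fa_rate)


if __name__ == "__main__":
    answers_correct = [True, True, True, True, False, False, False, False]
    reported_confident = [True, True, True, False, False, False, False, True]
    print(round(type2_dprime(answers_correct, reported_confident), 4))
```

A value near zero means confidence does not discriminate correct from incorrect answers; larger positive values mean confidence tracks correctness more closely.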
| Comments: | Preprint |
| Subjects: | Neural and Evolutionary Computing (cs.NE); Artificial Intelligence (cs.AI); Computation and Language (cs.CL); Neurons and Cognition (q-bio.NC) |
| Cite as: | arXiv:2602.02605 [cs.NE] (or arXiv:2602.02605v2 [cs.NE] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2602.02605 (arXiv-issued DOI via DataCite) |
From: Sangjun Park
[v1] Mon, 2 Feb 2026 04:08:13 UTC (1,441 KB)
[v2] Sun, 24 May 2026 06:55:56 UTC (1,449 KB)