
























Abstract: Large Language Models (LLMs) are unable to reliably reason about specific physical systems. Attempts to imbue LLMs with knowledge of the necessary physics concepts have shown great promise, but explainability and validation remain open challenges. An emerging alternative is tooling, where LLMs can query physical simulators and use the resulting simulation traces as context for validation. This approach suffers from poor scalability since simulation traces contain large volumes of fine-grained numerical and semantic data. We show that translating simulation traces to a sparse representation of "high-level" structural patterns leads to more effective interpretation by LLMs. We propose an unsupervised learning scheme to perform this translation, or annotation, via program synthesis. Our learning results in a library of programs that act as pattern detectors which can translate simulation traces to sparse, annotated pattern sequences. The detected patterns may optionally be guided by human experts via string labels (rigid collision, stretching spring, etc.). We show, using a recent physics benchmark, that such annotated representations are more amenable to natural language reasoning about specific physical systems. The synthesized programs serve as transparent, explainable functions that map system states to a sparse and efficient annotation space. As an example application, we show how goals within physical systems that are specified in natural language may be converted to reward programs which are maximized to find solutions.
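To make the idea of pattern detectors concrete, the following is a minimal sketch of how a small library of detector programs might turn a dense simulation trace into a sparse sequence of labelled events. It is an illustration only, not the paper's method: the detectors here are written by hand rather than synthesized, and the state fields (`t`, `ball_vy`, `spring_len`, `spring_rest`), the function names, and the toy trace are all assumptions introduced for this example. Only the two string labels come from the abstract.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a state is a flat dict of floats, and a detector
# inspects two consecutive states and reports whether its pattern holds.
State = Dict[str, float]
Detector = Callable[[State, State], bool]


def rigid_collision(prev: State, curr: State) -> bool:
    # Assumed heuristic: vertical velocity flips from downward to upward.
    return prev["ball_vy"] < 0.0 and curr["ball_vy"] > 0.0


def stretching_spring(prev: State, curr: State) -> bool:
    # Assumed heuristic: spring is longer than rest length and still growing.
    return (curr["spring_len"] > curr["spring_rest"]
            and curr["spring_len"] > prev["spring_len"])


def annotate(trace: List[State],
             detectors: Dict[str, Detector]) -> List[Tuple[float, str]]:
    """Emit (time, label) only at the step where a pattern starts."""
    events: List[Tuple[float, str]] = []
    active = {label: False for label in detectors}
    for prev, curr in zip(trace, trace[1:]):
        for label, detect in detectors.items():
            fired = detect(prev, curr)
            if fired and not active[label]:
                events.append((curr["t"], label))
            active[label] = fired
    return events


# Hand-written stand-in for a learned library of detector programs.
DETECTORS: Dict[str, Detector] = {
    "rigid collision": rigid_collision,
    "stretching spring": stretching_spring,
}

# Toy trace with made-up values, used only to exercise the sketch.
TRACE: List[State] = [
    {"t": 0.0, "ball_vy": -1.0, "spring_len": 1.0, "spring_rest": 1.0},
    {"t": 0.1, "ball_vy": -2.0, "spring_len": 1.0, "spring_rest": 1.0},
    {"t": 0.2, "ball_vy": 1.5, "spring_len": 1.1, "spring_rest": 1.0},
    {"t": 0.3, "ball_vy": 0.5, "spring_len": 1.3, "spring_rest": 1.0},
    {"t": 0.4, "ball_vy": -0.5, "spring_len": 1.2, "spring_rest": 1.0},
]

if __name__ == "__main__":
    for time, label in annotate(TRACE, DETECTORS):
        print(f"t={time}: {label}")
```

Recording only the onset of each pattern is one possible way to keep the annotation sparse: a five-step trace with four numeric fields per step collapses to two labelled events, which is the kind of compact context the abstract argues is easier for an LLM to interpret.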
| Subjects: | Artificial Intelligence (cs.AI); Human-Computer Interaction (cs.HC) |
| ACM classes: | I.2.1; I.2.7 |
| Cite as: | arXiv:2602.10009 [cs.AI] (or arXiv:2602.10009v2 [cs.AI] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2602.10009 (arXiv-issued DOI via DataCite) |
From: Sean Memery
[v1] Tue, 10 Feb 2026 17:31:39 UTC (2,114 KB)
[v2] Thu, 21 May 2026 13:15:29 UTC (23,601 KB)