Abstract: We study the Riemannian geometry of the Deep Linear Network (DLN) as a foundation for a thermodynamic description of the learning process. The main tools are the use of group actions to analyze overparametrization and the use of Riemannian submersion from the space of parameters to the space of observables. The foliation of the balanced manifold in the parameter space by group orbits is used to define and compute a Boltzmann entropy. We also show that the Riemannian geometry on the space of observables defined in [2] is obtained by Riemannian submersion of the balanced manifold. The main technical step is an explicit construction of an orthonormal basis for the tangent space of the balanced manifold using the theory of Jacobi matrices.
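To make the role of group orbits concrete, the following is a minimal sketch and not the paper's construction. It assumes the balancedness condition commonly used for DLNs, W_{p+1}^T W_{p+1} = W_p W_p^T, restricts to 2x2 factors, and uses plane rotations as stand-ins for the orthogonal group. All function names are hypothetical. It shows that different choices of the intermediate rotations give different balanced parameters with the same end-to-end matrix W = W_N ... W_1, which is the overparametrization the group action describes.

```python
import math

def rot(theta):
    """2x2 rotation matrix, used here as a sample orthogonal matrix."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

def balanced_factors(u_angle, v_angle, lam, q_angles):
    """Return [W_1, ..., W_N] with N = len(q_angles) + 1 (assumed form).

    W_1 = Q_1 L V^T, W_p = Q_p L Q_{p-1}^T, W_N = U L Q_{N-1}^T,
    where L = diag(lam) and U, V, Q_p are rotations.
    """
    diag = [[lam[0], 0.0], [0.0, lam[1]]]
    # frames = [V, Q_1, ..., Q_{N-1}, U]
    frames = [rot(v_angle)] + [rot(t) for t in q_angles] + [rot(u_angle)]
    return [matmul(matmul(frames[p + 1], diag), transpose(frames[p]))
            for p in range(len(frames) - 1)]

def end_to_end(factors):
    """Observable W = W_N ... W_1 (later factors multiply on the left)."""
    w = factors[0]
    for f in factors[1:]:
        w = matmul(f, w)
    return w

def balance_defect(factors):
    """Max entry of W_{p+1}^T W_{p+1} - W_p W_p^T over all adjacent pairs."""
    return max(
        max_abs_diff(matmul(transpose(factors[p + 1]), factors[p + 1]),
                     matmul(factors[p], transpose(factors[p])))
        for p in range(len(factors) - 1)
    )

if __name__ == "__main__":
    # Two points on the same orbit: same U, V, L; different Q_1, Q_2.
    a = balanced_factors(0.3, -0.7, (1.5, 0.5), [0.1, 0.2])
    b = balanced_factors(0.3, -0.7, (1.5, 0.5), [1.1, -2.0])
    print("balance defects:", balance_defect(a), balance_defect(b))
    print("difference in W:", max_abs_diff(end_to_end(a), end_to_end(b)))
    print("difference in W_1:", max_abs_diff(a[0], b[0]))
```

In this sketch the intermediate rotations cancel in the product, so the observable depends only on U, V, and the diagonal entries, while the individual factors change along the orbit.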
| Comments: | Final version of accepted paper in SIAM Journal on Mathematical Analysis. Includes fixes of minor typos (especially equations (3.13), (6.35) and (6.36)) |
| Subjects: | Machine Learning (cs.LG); Differential Geometry (math.DG); Dynamical Systems (math.DS) |
| MSC classes: | 68T07, 49J40, 94A17, 60B20 |
| Cite as: | arXiv:2509.09088 [cs.LG] (or arXiv:2509.09088v3 [cs.LG] for this version) |
| DOI: | https://doi.org/10.48550/arXiv.2509.09088 (arXiv-issued DOI via DataCite) |
From: Govind Menon
[v1] Thu, 11 Sep 2025 01:40:46 UTC (36 KB)
[v2] Fri, 2 Jan 2026 21:48:47 UTC (35 KB)
[v3] Wed, 20 May 2026 19:22:45 UTC (35 KB)