
























Abstract:Characterizing precisely the asymptotic generalization error of neural networks using parameters that can be estimated efficiently is a crucial problem in machine learning, which relies heavily on heuristics and practitioners' intuition to make key design choices. In order to mitigate this issue, we introduce the Representation Gap, a metric closely related to the generalization error, but admitting better-behaved asymptotic dynamics. Focusing on equivariant diffusion models and leveraging results from optimal quantization and point-process theory, we derive a precise asymptotic equivalent of the Representation Gap and show that it is governed by a single parameter, the \textit{intrinsic dimension} of the task, which is easy to interpret, efficient to estimate, and can be linked to the equivariances of common neural network architectures. We show that this asymptotic dynamic also extends to a broader range of tasks and training algorithms. Finally, we demonstrate empirically that our asymptotic law and intrinsic dimension estimation are accurate on a wide range of synthetic datasets, where these quantities are known, as well as on more realistic datasets, where we obtain results consistent with the related literature.
| Subjects: | Machine Learning (cs.LG); Machine Learning (stat.ML) |
| Cite as: | arXiv:2605.21692 [cs.LG] |
| (or arXiv:2605.21692v1 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2605.21692 arXiv-issued DOI via DataCite (pending registration) |
From: David Perera [view email]
[v1]
Wed, 20 May 2026 19:51:25 UTC (4,357 KB)
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。