
























Abstract: Discrete probability laws underpin statistical modeling, yet the catalog of interpretable distributions has expanded only gradually through centuries of case-by-case mathematical derivations. We introduce symbolic density estimation (SDE), an unsupervised framework that automatically recovers closed-form probability mass functions by composing elementary analytic operations within a structured search space. Our method integrates domain-specific structural priors with evolutionary search and a validity-aware inference stage, and it extends to richer distribution families such as zero-inflated models and finite mixtures. To support systematic evaluation and future research, we contribute a benchmark dataset spanning a broad collection of commonly used discrete distributions. The proposed algorithm recovers all benchmark families with accurate parameter estimates. A real-data application shows that it identifies concise and interpretable mixture models that improve goodness-of-fit over standard models.
| Comments: | 28 pages, 5 figures, 22 tables |
| Subjects: | Machine Learning (cs.LG) |
| Cite as: | arXiv:2605.21813 [cs.LG] |
| | (or arXiv:2605.21813v1 [cs.LG] for this version) |
| | https://doi.org/10.48550/arXiv.2605.21813 (arXiv-issued DOI via DataCite, pending registration) |
From: Ziwen Liu
[v1] Wed, 20 May 2026 23:22:21 UTC (722 KB)