




















Abstract:Clustering in a big data setting is an intensively studied problem, with coresets emerging as one of the important paradigms in this line of work. Given a cost function $\text{cost}(P,S)$ mapping input points $P$ and a solution $S$ to an objective value, a coreset is a typically weighted sketch $\Omega\subseteq P$ such that $\text{cost}(\Omega,S)\approx \text{cost}(P,S)$. In practice, coreset sizes much smaller than those suggested by theoretical guarantees are often found to be sufficient.
In this paper, we offer an explanation for this phenomenon. Smaller coreset sizes suffice if we only wish to preserve the costs of \emph{good} solutions, i.e., solutions with low cost. We define and devise \emph{approximation-preserving coresets}, which provide a weaker guarantee than strong coresets, which apply to all solutions, while providing stronger guarantees than weak coresets, which apply only to the optimum solution. We complement this result by showing that even a very small distortion in the approximation factor cannot admit coresets of this size.
From: Sudarshan Shyam [view email]
[v1]
Mon, 15 Jun 2026 21:48:03 UTC (412 KB)
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。