





















We present an algorithm for computing $F_p$, the $p$th moment of an $n$-dimensional frequency vector of a data stream, for $2 < p < \log (n) $, to within $1\pm ε$ factors, $ε\in [n^{-1/p},1]$ with high constant probability. Let $m$ be the number of stream records and $M$ be the largest magnitude of a stream update. The algorithm uses space in bits $$ O(p^2ε^{-2}n^{1-2/p}E(p,n) \log (n) \log (nmM)/\min(\log (n),ε^{4/p-2}))$$ where, $E(p,n) = (1-2/p)^{-1}(1-n^{-4(1-2/p})$. Here $E(p,n)$ is $ O(1)$ for $p = 2+Ω(1)$ and $ O(\log n)$ for $p = 2 + O(1/\log (n)$. This improves upon the space required by current algorithms \cite{iw:stoc05,bgks:soda06,ako:arxiv10,bo:arxiv10} by a factor of at least $Ω(ε^{-4/p} \min(\log (n), ε^{4/p-2}))$. The update time is $O(\log (n))$. We use a new technique for designing estimators for functions of the form $ψ(\expect{X})$, where, $X$ is a random variable and $ψ$ is a smooth function, based on a low-degree Taylor polynomial expansion of $ψ(\expect{X})$ around an estimate of $\expect{X}$.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。