Quicksort, Largest Bucket, and Min-Wise Hashing with Limited Independence
Mathias Bæk Tejs Knudsen, Morten Stöckel · 2015-02-20 · via cs.DS updates on arXiv.org

Randomized algorithms and data structures are often analyzed under the assumption of access to a perfect source of randomness. The most fundamental measure of how "random" a hash function or a random number generator is, is its independence: a sequence of random variables is said to be $k$-independent if every variable is uniform and every subset of size $k$ is independent. In this paper we consider three classic algorithms under limited independence. We provide new bounds for randomized quicksort, min-wise hashing and largest bucket size under limited independence. Our results can be summarized as follows.

- Randomized quicksort. When pivot elements are computed using a $5$-independent hash function, Karloff and Raghavan (J.ACM'93) showed $O(n \log n)$ expected worst-case running time for a special version of quicksort. We improve upon this, showing that the same running time is achieved with only $4$-independence.
- Min-wise hashing. For a set $A$, consider the probability of a particular element being mapped to the smallest hash value. It is known that $5$-independence implies the optimal probability $O(1/n)$, with $n = |A|$. Broder et al. (STOC'98) showed that $2$-independence implies it is $O(1/\sqrt{|A|})$. We show a matching lower bound as well as new tight bounds for $3$- and $4$-independent hash functions.
- Largest bucket. We consider the case where $n$ balls are distributed to $n$ buckets using a $k$-independent hash function and analyze the largest bucket size. Alon et al. (STOC'97) showed that there exists a $2$-independent hash function that implies a bucket of size $\Omega(n^{1/2})$. We generalize the bound, providing a $k$-independent family of functions that imply size $\Omega(n^{1/k})$.
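To make the notion of $k$-independence concrete, here is a minimal Python sketch of one standard way to obtain a $k$-independent family: evaluate a random polynomial of degree $k-1$ over a prime field. The abstract does not prescribe this construction, so treat it as an assumption. The names `make_k_independent_hash`, `largest_bucket`, and `min_hash_frequency` and the choice of prime `P` are hypothetical, introduced only for illustration. Reducing the field value modulo the number of buckets makes the output only approximately uniform. Random polynomials also do not reproduce the adversarial families behind the lower bounds above; the sketch only shows how the two measured quantities are defined.

```python
import random

# Assumption: a Mersenne prime used as the field size; keys must lie in [0, P).
P = (1 << 61) - 1


def make_k_independent_hash(k, m, rng):
    """Return h(x) = (random degree-(k-1) polynomial over Z_P)(x) mod m.

    The polynomial values are k-independent over Z_P for distinct keys;
    the final reduction mod m is only approximately uniform when m < P.
    """
    coeffs = [rng.randrange(P) for _ in range(k)]

    def h(x):
        acc = 0
        for c in coeffs:  # Horner's rule
            acc = (acc * x + c) % P
        return acc % m

    return h


def largest_bucket(n, k, rng):
    """Throw n balls (keys 0..n-1) into n buckets; return the fullest bucket's size."""
    h = make_k_independent_hash(k, n, rng)
    counts = [0] * n
    for x in range(n):
        counts[h(x)] += 1
    return max(counts)


def min_hash_frequency(A, target, k, trials, rng):
    """Estimate the probability that `target` gets the smallest hash value in A."""
    hits = 0
    for _ in range(trials):
        h = make_k_independent_hash(k, P, rng)  # m = P keeps the raw field value
        if min(A, key=h) == target:
            hits += 1
    return hits / trials


if __name__ == "__main__":
    rng = random.Random(2015)
    for k in (2, 3, 4, 5):
        print("k =", k, "largest bucket:", largest_bucket(10_000, k, rng))
    A = list(range(50))
    print("min-hash frequency:", min_hash_frequency(A, A[0], 4, 2_000, rng))
```

For a set of 50 elements, a truly random hash function would give each element probability 1/50 of receiving the smallest value; the estimate from `min_hash_frequency` can be compared against that baseline for different values of `k`.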