



























The minimax sample complexity of group distributionally robust optimization (GDRO) has been determined up to a $\log(K)$ factor, where $K$ is the number of groups. In this work, we venture beyond the minimax perspective via a novel notion of sparsity that we dub $(λ, β)$-sparsity. In short, this condition means that at any parameter $θ$, there is a set of at most $β$ groups whose risks at $θ$ all are at least $λ$ larger than the risks of the other groups. To find an $ε$-optimal $θ$, we show via a novel algorithm and analysis that the $ε$-dependent term in the sample complexity can swap a linear dependence on $K$ for a linear dependence on the potentially much smaller $β$. This improvement leverages recent progress in sleeping bandits, showing a fundamental connection between the two-player zero-sum game optimization framework for GDRO and per-action regret bounds in sleeping bandits. We next show an adaptive algorithm which, up to log factors, gets a sample complexity bound that adapts to the best $(λ, β)$-sparsity condition that holds. We also show how to get a dimension-free semi-adaptive sample complexity bound with a computationally efficient method. Finally, we demonstrate the practicality of the $(λ, β)$-sparsity condition and the improved sample efficiency of our algorithms on both synthetic and real-life datasets.
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。