






















Abstract:We consider the problem of finding $\delta$-stationary points for $\delta > 0$, i.e., $x \in \mathbb{R}^d$ such that $||\nabla F(x)|| \le \delta$, for smooth, non-convex objectives, where the derivative oracles are not only stochastic but also biased. In the first-order setting, we provide tight lower bounds for finding an $O((\epsilon + B^2)^{1/2})$-stationary point, for $\epsilon > 0$ and where $B$ is a bound on the gradient bias, matching the upper bounds of Ajalloeian and Stich (2020). We then establish bias-dependent lower bounds for algorithms that use higher-order derivative information for finding $O(\epsilon + B_{\max})$-stationary points, where $B_{\max}$ is a bound on the maximum bias for all derivatives. To complement these lower bounds, we develop trust-region based methods that, for certain ranges of bias, provide guarantees that match the corresponding lower bounds. We further improve upon the oracle complexity in high bias settings through a higher-order variance reduction scheme, in particular demonstrating the benefits, in some cases, of using higher-order derivative information, whereas such improvements are known to be unattainable for stochastic unbiased settings.
From: Anant Shyam [view email]
[v1]
Wed, 17 Jun 2026 19:51:08 UTC (31 KB)
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。