




















Abstract: This paper considers sequential gray-box optimization, where the objective function is the composition of a loss function and a parametric model. Crucially, the model parameters are unknown and must be estimated iteratively from noisy observations of the model outputs. This problem setup generalizes the parametric black-box optimization problem known as the (contextual) stochastic linear bandit. To address the sequential gray-box optimization problem, we propose a structure-exploiting method that leverages known problem structure, given in terms of the loss function and an a priori set of admissible parameters. The method is based on the principle of optimism in the face of uncertainty and trades off exploration and exploitation by minimizing a lower confidence bound on the true objective function. We provide a detailed regret analysis of the novel method, which improves on state-of-the-art results for the special case of linear stochastic bandits by using a recently published bound for the parameter confidence sets arising in multi-output linear least-squares estimation. Numerical examples illustrate the superior performance of structure-exploiting methods compared to structure-agnostic approaches.
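
To make the idea of minimizing a lower confidence bound concrete, the following is a minimal sketch of the linear stochastic bandit special case mentioned in the abstract, not the paper's structure-exploiting method. It assumes a finite action set, a scalar linear cost, Gaussian noise, and a ridge-regression estimate. The function name `lcb_linear_bandit` and the parameters `beta`, `reg`, and `noise_std` are hypothetical; in particular, `beta` is a fixed constant here, whereas a regret analysis would derive the confidence radius from a concentration bound.

```python
import numpy as np


def lcb_linear_bandit(actions, theta_true, n_rounds=200, beta=2.0,
                      reg=1.0, noise_std=0.1, seed=0):
    """Optimistic play for a linear bandit with a finite action set.

    Each round picks the action minimizing a lower confidence bound (LCB)
    on the unknown cost x^T theta. Returns the final ridge estimate of
    theta and the cumulative regret after each round.
    All names and default values are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    dim = actions.shape[1]
    V = reg * np.eye(dim)            # regularized Gram matrix
    b = np.zeros(dim)                # running sum of y_t * x_t
    true_cost = actions @ theta_true
    best_cost = true_cost.min()
    regret = np.zeros(n_rounds)
    total = 0.0
    for t in range(n_rounds):
        V_inv = np.linalg.inv(V)
        theta_hat = V_inv @ b        # ridge least-squares estimate
        # Per-action uncertainty sqrt(x^T V^{-1} x).
        width = np.sqrt(np.einsum("ij,jk,ik->i", actions, V_inv, actions))
        # Optimism for minimization: estimated cost minus scaled uncertainty.
        lcb = actions @ theta_hat - beta * width
        k = int(np.argmin(lcb))
        x = actions[k]
        # Noisy observation of the model output for the chosen action.
        y = x @ theta_true + noise_std * rng.normal()
        V += np.outer(x, x)
        b += y * x
        total += true_cost[k] - best_cost
        regret[t] = total
    return np.linalg.solve(V, b), regret
```

Calling `lcb_linear_bandit(actions, theta_true)` with an `(n_actions, dim)` array and a `dim`-vector returns the parameter estimate and the cumulative regret curve. A gray-box variant in the spirit of the abstract would additionally pass the model output through a known loss function and restrict the estimate to an admissible parameter set, which this sketch does not attempt.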
From: Katrin Baumgärtner
[v1] Tue, 16 Jun 2026 09:41:41 UTC (691 KB)