























Abstract: Sequential Bayesian experimental design is often formulated as a fixed-horizon policy optimization problem, in which the number of experiments is specified before data collection begins. In practical campaigns, however, additional measurements may provide diminishing information relative to their cost, making termination an integral part of experimental design. Common threshold-based stopping rules are easy to implement but myopic, because they compare the current state with a fixed criterion rather than the expected value of future experiments. This work develops a Bayesian optimal stopping framework for sequential experimental design by treating design and stopping as coupled decisions in a finite-horizon sequential decision problem. We prove that, for any fixed design policy, the optimal stopping rule terminates when the immediate terminal reward is no smaller than the expected continuation value. We then derive a policy-gradient method for learning continuous design policies with value-based stopping. The resulting optimization is challenging because the design policy, continuation value, and stopping boundary are mutually dependent, and naïve training can become trapped in early-stopping local optima. To address this difficulty, we introduce a curriculum strategy that gradually transitions from forced continuation to adaptive stopping during training. Numerical studies on a linear-Gaussian benchmark, a nonlinear test case, and a contaminant source detection problem show that the proposed approach learns stable, resource-aware design-stopping policies, with the largest gains in settings with strong sequential dependence.
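The stopping rule stated in the abstract (stop once the immediate terminal reward is no smaller than the expected continuation value) can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a scalar linear-Gaussian model with a fixed design `d`, a terminal reward equal to the expected information gain minus a linear per-experiment cost, and hypothetical names and parameter values (`terminal_reward`, `optimal_stopping_time`, `cost`) chosen only for illustration. Under these assumptions the posterior variance does not depend on the observed data, so the expectation in the continuation value is trivial and backward induction reduces to a deterministic recursion.

```python
import math


def terminal_reward(t, d=1.0, prior_var=1.0, noise_var=1.0, cost=0.1):
    """Reward for stopping after t experiments (assumed form, not from the paper).

    Expected information gain of t observations y = d * theta + noise in a
    scalar linear-Gaussian model, minus a linear experiment cost.
    """
    info_gain = 0.5 * math.log(1.0 + t * d * d * prior_var / noise_var)
    return info_gain - cost * t


def optimal_stopping_time(horizon, **kwargs):
    """Backward induction for a fixed design over a finite horizon."""
    g = [terminal_reward(t, **kwargs) for t in range(horizon + 1)]

    # value[t] is the best achievable reward from step t onward.
    value = [0.0] * (horizon + 1)
    value[horizon] = g[horizon]  # stopping is forced at the horizon
    for t in range(horizon - 1, -1, -1):
        continuation = value[t + 1]  # deterministic here, so no expectation
        value[t] = max(g[t], continuation)

    # Stop at the first step where stopping is at least as good as continuing.
    for t in range(horizon + 1):
        if t == horizon or g[t] >= value[t + 1]:
            return t, value


if __name__ == "__main__":
    stop_at, value = optimal_stopping_time(horizon=10)
    print(stop_at, round(value[0], 4))
```

With the default values assumed here, the marginal information gain falls below the per-experiment cost after the fourth experiment, so the rule stops at step 4 of 10. In nonlinear or data-dependent settings the continuation value is a genuine expectation over future observations and would have to be estimated, which this sketch does not attempt.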
From: Chen Cheng
[v1] Fri, 26 Sep 2025 01:02:24 UTC (308 KB)
[v2] Thu, 28 May 2026 01:10:56 UTC (1,691 KB)
[v3] Sat, 13 Jun 2026 02:10:15 UTC (1,702 KB)