





















Abstract: In many risk-sensitive domains, machine learning systems must be audited continuously as new data arrives, so that any departure from their intended behavior is detected quickly. This auditing task can be modeled as a sequential hypothesis testing problem with $k$ data streams and a global null hypothesis asserting that the system operates as intended across all $k$ streams. Under the alternative, the standard global sequential test, which uses a Bonferroni correction, has an expected stopping time of $O\left(\ln \frac{k}{\alpha}\right)$ for large $k$ and significance level $\alpha$. In this work, we demonstrate that efficient sequential tests, which merge martingales via averaging and product rules, achieve shorter stopping times and thus yield more powerful tests against the null. Using these results, we show that a balanced test can match the Bonferroni rate of $O\left(\ln \frac{k}{\alpha}\right)$ in the sparse regime (just a few non-null streams) while achieving $O\left(\frac{1}{k}\ln \frac{1}{\alpha}\right)$ under dense alternatives (many non-null streams). We validate our theory through experiments on both synthetic and real-world data.
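To make the three rejection rules named in the abstract concrete, the following is a minimal sketch and not the authors' implementation. It assumes independent unit-variance Gaussian streams with mean zero under the null, and uses a fixed-alternative likelihood-ratio martingale per stream. The function name `stopping_times`, the betting parameter `mu`, and the simulation setup are all hypothetical choices made for illustration. Each rule rejects by Ville's inequality: Bonferroni when any single martingale reaches $k/\alpha$, averaging when the mean of the martingales reaches $1/\alpha$, and the product rule (valid here because the streams are assumed independent) when their product reaches $1/\alpha$.

```python
import math
import random


def stopping_times(stream_means, alpha=0.05, mu=0.5, max_steps=10_000, seed=0):
    """Simulate k Gaussian streams and return the first rejection time per rule.

    Assumptions (illustrative, not from the paper): stream i emits N(mean_i, 1)
    samples, the null is mean_i = 0 for all i, and each stream's test
    martingale is the likelihood ratio of N(mu, 1) against N(0, 1).
    A value of None means the rule did not reject within max_steps.
    """
    rng = random.Random(seed)
    k = len(stream_means)
    log_k = math.log(k)
    log_thresh = math.log(1.0 / alpha)
    log_m = [0.0] * k  # log of each per-stream martingale, started at 1
    stops = {"bonferroni": None, "average": None, "product": None}

    for t in range(1, max_steps + 1):
        for i, mean in enumerate(stream_means):
            x = rng.gauss(mean, 1.0)
            # Log likelihood-ratio increment; its exponential has mean 1 under the null.
            log_m[i] += mu * x - 0.5 * mu * mu

        log_max = max(log_m)
        log_max_over_k = log_max - log_k
        # Log of the average martingale, computed stably (log-sum-exp).
        log_avg = log_max_over_k + math.log(sum(math.exp(v - log_max) for v in log_m))
        # Log of the product martingale (requires independent streams).
        log_prod = sum(log_m)

        # Bonferroni: some single martingale reaches k / alpha.
        if stops["bonferroni"] is None and log_max_over_k >= log_thresh:
            stops["bonferroni"] = t
        # Averaging rule: the mean of the martingales reaches 1 / alpha.
        if stops["average"] is None and log_avg >= log_thresh:
            stops["average"] = t
        # Product rule: the product of the martingales reaches 1 / alpha.
        if stops["product"] is None and log_prod >= log_thresh:
            stops["product"] = t

        if all(v is not None for v in stops.values()):
            break
    return stops


if __name__ == "__main__":
    k = 20
    dense = stopping_times([0.5] * k, seed=1)                # every stream is non-null
    sparse = stopping_times([0.5] + [0.0] * (k - 1), seed=1)  # one non-null stream
    print("dense :", dense)
    print("sparse:", sparse)
```

In this sketch the averaging rule never stops later than Bonferroni, because a single martingale above $k/\alpha$ forces the average above $1/\alpha$. The product rule accumulates evidence from all streams at once, which helps when many streams are non-null but can hurt when most are null, since each null stream's log-martingale drifts downward. That trade-off is the intuition behind wanting a balanced test, although the specific balanced construction analyzed in the paper is not reproduced here.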
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:2602.21479 [stat.ML] (or arXiv:2602.21479v2 [stat.ML] for this version)
DOI: https://doi.org/10.48550/arXiv.2602.21479 (arXiv-issued DOI via DataCite)
From: Beepul Bharti
[v1] Wed, 25 Feb 2026 01:10:45 UTC (3,125 KB)
[v2] Sun, 24 May 2026 00:31:34 UTC (3,096 KB)