























Abstract: We develop a scan statistic method for detecting local clusters in a two-sample nonhomogeneous Poisson process (NHPP) framework, motivated by copy number variation (CNV) analysis in next-generation sequencing data. The control sample is used to construct an empirical time transformation, under which the transformed case sample is approximately uniform on [0,1] under the null hypothesis. The scan statistic is defined as the maximum number of transformed points within a moving window.
We show that the scan statistic converges to a generalized extreme value (GEV) distribution with an extremal index that captures the dependence induced by overlapping windows. The GEV parameters and extremal index are estimated using maximum likelihood and exceedance clustering methods, providing an asymptotic calibration of the test. A permutation procedure is also developed to provide a nonparametric alternative. Simulation studies show that the permutation calibration maintains empirical Type I error close to the nominal level across the considered settings, and the GEV calibration is accurate for smaller windows. Both proposed procedures show competitive power compared with the continuous testing method under heterogeneous baseline intensities. An application to sequencing data illustrates the effectiveness of the proposed approach for detecting CNV regions.
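As a rough illustration of the pipeline described in the abstract, the following minimal sketch transforms case points with the empirical CDF of the control points, computes the maximum count in a moving window, and calibrates it by shuffling case/control labels. The function names, the step-function form of the transformation, and the label-shuffling scheme are assumptions made for illustration; the paper's actual transformation and permutation procedure may differ, and the GEV calibration is not shown.

```python
import numpy as np


def empirical_time_transform(case, control):
    """Map case points to [0, 1] using the empirical CDF of the control points.

    Assumption: a plain step-function ECDF; a smoothed or interpolated
    version may be closer to what the paper uses.
    """
    control = np.sort(np.asarray(control, dtype=float))
    case = np.asarray(case, dtype=float)
    # Fraction of control points at or before each case point.
    return np.searchsorted(control, case, side="right") / control.size


def scan_statistic(u, w):
    """Maximum number of points of u in any closed window of width w."""
    u = np.sort(np.asarray(u, dtype=float))
    # The maximum is attained by a window whose left edge sits on a point,
    # so it suffices to count the points in [u[i], u[i] + w] for each i.
    right = np.searchsorted(u, u + w, side="right")
    return int(np.max(right - np.arange(u.size)))


def permutation_pvalue(case, control, w, n_perm=200, seed=0):
    """Permutation p-value for the scan statistic (hypothetical scheme).

    Assumption: labels are shuffled over the pooled sample while keeping
    the case and control sample sizes fixed.
    """
    rng = np.random.default_rng(seed)
    case = np.asarray(case, dtype=float)
    control = np.asarray(control, dtype=float)
    observed = scan_statistic(empirical_time_transform(case, control), w)
    pooled = np.concatenate([case, control])
    n_case = case.size
    exceed = 0
    for _ in range(n_perm):
        perm = rng.permutation(pooled)
        stat = scan_statistic(
            empirical_time_transform(perm[:n_case], perm[n_case:]), w
        )
        exceed += int(stat >= observed)
    # Add-one correction so the p-value is never exactly zero.
    return observed, (1 + exceed) / (1 + n_perm)


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    # Synthetic data: uniform background plus a local cluster in the case sample.
    control = rng.uniform(0.0, 1000.0, size=500)
    case = np.concatenate(
        [rng.uniform(0.0, 1000.0, size=300), rng.uniform(400.0, 410.0, size=40)]
    )
    stat, p = permutation_pvalue(case, control, w=0.02)
    print(f"scan statistic = {stat}, permutation p-value = {p:.3f}")
```

The window width `w` is expressed on the transformed [0,1] scale, so it corresponds to a fixed fraction of control points rather than a fixed genomic length.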
From: Tung-Lung Wu
[v1]
Thu, 11 Jun 2026 23:27:36 UTC (2,745 KB)