Information-Theoretic Reliability is Robust to Analytic Choice: A 24-Specification Multiverse on Public Cognitive Test-Retest Data
Maria Westrin · 2026-05-26 · via stat updates on arXiv.org

Abstract: Background. The reliability paradox describes the empirical observation that cognitive tasks producing robust group-level effects often yield poor between-individual reliability. Existing approaches rely predominantly on the intraclass correlation coefficient (ICC), which captures only linear, second-moment dependence between test and retest.
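As a point of reference for the ICC mentioned here, the following is a minimal sketch of ICC(2,1) (two-way random effects, absolute agreement, single measurement) computed from the standard two-way ANOVA mean squares. The function name and the toy score matrix are assumptions for illustration only; the matrix is not data from the paper, and this is not the released pipeline.

```python
def icc_2_1(scores):
    """ICC(2,1) for an n-subjects x k-sessions matrix of scores.

    Uses the standard two-way ANOVA decomposition:
    (MSR - MSE) / (MSR + (k - 1) * MSE + k * (MSC - MSE) / n)
    """
    n, k = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)   # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)   # between sessions
    ss_total = sum((v - grand) ** 2 for row in scores for v in row)
    ss_err = ss_total - ss_rows - ss_cols                    # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy matrix (hypothetical): 6 subjects measured on 4 occasions.
toy = [
    [9, 2, 5, 8],
    [6, 1, 3, 2],
    [8, 4, 6, 8],
    [7, 1, 2, 6],
    [10, 5, 6, 9],
    [6, 2, 4, 7],
]
icc = icc_2_1(toy)
print(f"ICC(2,1) = {icc:.3f}")  # approximately 0.290
```

For a test-retest design the matrix would have k = 2 columns, one per session. Because every term is a mean square, the statistic depends only on second moments of the scores, which is the limitation the Background paragraph points to.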
Methods. We introduce a normalized, information-theoretic complement to ICC, NLRΔ, defined as the difference between empirically estimated mutual information and the analytic Gaussian baseline implied by the test-retest correlation. We pair NLRΔ with ICC(2,1), bias-corrected and accelerated (BCa) bootstrap intervals, Benjamini-Hochberg false discovery rate (FDR) control, and a 24-cell multiverse over the KSG nearest-neighbour parameter, correlation method, and minimum-sample threshold. The full pipeline is governed by pre-specified claim contracts, content-addressed provenance, and SHA-256-verified raw data ingestion, and is released as the MixMind Reliability Framework.
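The definition in the Methods paragraph can be illustrated with a minimal sketch. It computes the analytic Gaussian baseline, which for a bivariate Gaussian with correlation r is -0.5 * ln(1 - r^2) nats, and subtracts it from a brute-force KSG (algorithm 1) estimate of mutual information. The function names, the default k = 3, and the simulated data are assumptions for illustration. The abstract does not state how the measure is normalized, so the sketch returns only the raw, unnormalized gap and should not be read as the MixMind implementation.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def pearson_r(x, y):
    """Sample Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def gaussian_mi(r):
    """Mutual information (nats) of a bivariate Gaussian with correlation r."""
    return -0.5 * math.log(1.0 - r * r)


def ksg_mi(x, y, k=3):
    """KSG algorithm-1 mutual information estimate in nats (brute force, O(n^2))."""
    n = len(x)
    total = 0.0
    for i in range(n):
        # Max-norm distances from point i to every other point in the joint space.
        dists = sorted(
            max(abs(x[i] - x[j]), abs(y[i] - y[j])) for j in range(n) if j != i
        )
        eps = dists[k - 1]  # distance to the k-th nearest neighbour
        # Count marginal neighbours strictly inside that distance.
        nx = sum(1 for j in range(n) if j != i and abs(x[i] - x[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(y[i] - y[j]) < eps)
        total += digamma_int(nx + 1) + digamma_int(ny + 1)
    return digamma_int(k) + digamma_int(n) - total / n


def raw_mi_gap(x, y, k=3):
    """Estimated MI minus the Gaussian baseline (unnormalized, hypothetical helper)."""
    return ksg_mi(x, y, k) - gaussian_mi(pearson_r(x, y))


# Simulated test and retest scores (not data from the paper).
rng = random.Random(0)
x = [rng.gauss(0.0, 1.0) for _ in range(200)]
y = [0.6 * a + 0.8 * rng.gauss(0.0, 1.0) for a in x]
gap = raw_mi_gap(x, y, k=3)
print(f"raw gap = {gap:.3f} nats")
```

A negative gap means the estimated mutual information falls below what a Gaussian with the same correlation would carry. For jointly Gaussian data such as the simulation above, the gap should sit near zero up to estimator noise.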
Results. Across 50 estimable primary measures from the Flanker, Stroop, Stop-Signal, Go/No-Go, and Posner task families, the median NLRΔ is -0.138 nats, with interquartile range [-0.257, -0.034]. Zero of 50 primary measures exceed the headline rule. The companion ICC(2,1) analysis recovers the classical reliability paradox pattern, and the 24-specification multiverse yields 0 of 1,200 estimable cells passing the headline rule.
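The cell count in the Results is consistent with crossing 24 specifications with the 50 primary measures (24 × 50 = 1,200). The sketch below shows one way such a grid could be enumerated. The specific levels, and the 4 × 2 × 3 split across the three factors, are hypothetical: the abstract names the factors but not their values. The headline rule is likewise not defined in the abstract, so it appears only as a boolean placeholder.

```python
from itertools import product

# Hypothetical factor levels; only the three factor names come from the abstract.
KSG_K = (3, 4, 5, 7)                   # KSG nearest-neighbour parameter
CORR_METHODS = ("pearson", "spearman")  # correlation method
MIN_N = (20, 30, 40)                   # minimum-sample threshold

N_PRIMARY_MEASURES = 50  # stated in the abstract

# Every combination of factor levels is one specification.
specs = [
    {"k": k, "corr": corr, "min_n": min_n}
    for k, corr, min_n in product(KSG_K, CORR_METHODS, MIN_N)
]


def summarize(passes):
    """Return (cells passing, estimable cells) for a {(spec, measure): bool} map."""
    return sum(passes.values()), len(passes)


# Placeholder outcome matching the reported pattern: no cell passes the rule.
passes = {
    (s, m): False
    for s in range(len(specs))
    for m in range(N_PRIMARY_MEASURES)
}
n_pass, n_cells = summarize(passes)
print(f"{len(specs)} specifications, {n_pass} of {n_cells} cells passing")
```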
Conclusions. On these two public datasets, replacing or augmenting ICC with an information-theoretic reliability measure does not rescue cognitive tasks from the reliability paradox. The robust null is invariant to the analytic choices examined here. We release the full pipeline, raw-data hashes, and contracts to enable exact replication and extension to other datasets and tasks.
Comments: 12 pages, 2 figures, 3 tables; software and reproducibility materials archived at Zenodo DOI https://doi.org/10.5281/zenodo.20207371
Subjects: Methodology (stat.ME)
Cite as: arXiv:2605.24995 [stat.ME]
  (or arXiv:2605.24995v1 [stat.ME] for this version)
  https://doi.org/10.48550/arXiv.2605.24995

arXiv-issued DOI via DataCite (pending registration)

Submission history

From: Maria Westrin
[v1] Sun, 24 May 2026 10:48:31 UTC (143 KB)