Why Agentic Theorem Prover Works: A Statistical Provability Theory of Mathematical Reasoning Models
Sho Sonoda · 2026-05-26 · via stat updates on arXiv.org

Abstract: Agentic theorem provers combine a reasoning model, retrieval, search, and a proof assistant verifier, yet it remains unclear which components actually improve finite-budget proof success and why they help on real mathematical workloads. We study this question through statistical provability: the probability of reaching a verified proof within a budget on a specified stream of theorem instances. We model formal proof search as a finite-horizon reachability MDP with deterministic verifier dynamics, and show that under a faithful state abstraction the optimal success probability coincides with ordinary syntactic provability. We then analyze a simple but practically important pipeline: depth-wise offline action-value regression followed by greedy test-time proving. Our main theorem bounds the provability gap between the learned prover and the optimal prover by an occupancy-weighted sum of uniform action-value errors; in the common uniform-error reading, the leading complexity multiplier is the learned prover's average truncated proof length. The error decomposes into approximation error, geometric coverage of the training distribution, and Monte Carlo label noise, and improves to a fast rate under an action-gap margin condition. The result gives a component-sensitive account of why verifier feedback, retrieval, representation geometry, and proof-shortening mechanisms help on biased theorem workloads, without contradicting classical worst-case hardness.
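The pipeline named in the abstract (depth-wise offline action-value regression, then greedy test-time proving) can be illustrated on a toy problem. The following is a minimal sketch under assumptions that are not taken from the paper: the proof-state graph, the tactic names, the `VERIFIER` table, and the helpers `step`, `rollout`, `fit_depthwise_q`, and `greedy_prove` are all hypothetical. The regression is tabular (a per-cell sample mean), and the Monte Carlo labels come from uniform-random continuations, so the fitted values estimate the success probability of that random behaviour policy rather than the optimal one.

```python
import random

# Hypothetical toy proof-state graph (not from the paper).
# The "verifier" is a deterministic transition table over proof states.
TACTICS = ["intro", "simp", "induction"]
STATES = ["g0", "g1", "g2"]
QED, STUCK = "QED", "STUCK"
VERIFIER = {
    ("g0", "intro"): "g1",
    ("g1", "induction"): "g2",
    ("g1", "simp"): "g1",  # accepted but makes no progress: burns budget
    ("g2", "simp"): QED,
}
HORIZON = 4  # proof-step budget


def step(state, tactic):
    """Deterministic verifier dynamics; any unlisted move lands in STUCK."""
    if state in (QED, STUCK):
        return state
    return VERIFIER.get((state, tactic), STUCK)


def rollout(state, depth, rng):
    """Uniform-random continuation from `depth`; 1 if QED is reached in budget."""
    for _ in range(depth, HORIZON):
        if state == QED:
            return 1
        state = step(state, rng.choice(TACTICS))
    return int(state == QED)


def fit_depthwise_q(n_rollouts=500, seed=0):
    """One tabular regression per depth: Q[depth][(state, tactic)] is the mean
    of Monte Carlo success labels (the least-squares fit for a lookup table)."""
    rng = random.Random(seed)
    q = []
    for depth in range(HORIZON):
        table = {}
        for state in STATES:
            for tactic in TACTICS:
                nxt = step(state, tactic)
                labels = [rollout(nxt, depth + 1, rng) for _ in range(n_rollouts)]
                table[(state, tactic)] = sum(labels) / n_rollouts
        q.append(table)
    return q


def greedy_prove(q, start="g0"):
    """Test-time prover: at each depth, apply the tactic with the highest fitted value."""
    state, proof = start, []
    for depth in range(HORIZON):
        if state in (QED, STUCK):
            break
        tactic = max(TACTICS, key=lambda a: q[depth][(state, a)])
        proof.append(tactic)
        state = step(state, tactic)
    return state == QED, proof


if __name__ == "__main__":
    q = fit_depthwise_q()
    print(greedy_prove(q))
```

In this toy, tactics that the verifier rejects receive a fitted value of exactly zero, which mirrors the abstract's point about verifier feedback in the simplest possible way. The budget-burning `simp` move at `g1` receives a lower value than `induction` because it leaves fewer steps to finish the proof.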
Comments: Accepted at ICML 2026
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG)
Cite as: arXiv:2602.10538 [stat.ML]
  (or arXiv:2602.10538v3 [stat.ML] for this version)
  https://doi.org/10.48550/arXiv.2602.10538

arXiv-issued DOI via DataCite

Submission history

From: Dr. Sho Sonoda
[v1] Wed, 11 Feb 2026 05:22:24 UTC (30 KB)
[v2] Thu, 12 Feb 2026 19:27:06 UTC (31 KB)
[v3] Fri, 22 May 2026 18:21:47 UTC (136 KB)