
Dropout Neural Network Training Viewed from a Percolation Perspective
[Submitted on 15 Dec 2025 (v1), last revised 15 Jun 2026 (v2)] · 2026-06-17 · via math.PR updates on arXiv.org


Abstract: In this work, we investigate the existence and effect of percolation in training deep Neural Networks (NNs) with dropout. Dropout methods are regularisation techniques for training NNs, first introduced by G. Hinton et al. (2012). These methods temporarily remove connections in the NN, randomly at each stage of training, and update the remaining subnetwork with Stochastic Gradient Descent (SGD). The process of removing connections from a network at random is similar to percolation, a paradigm model of statistical physics.
If dropout were to remove enough connections that no path remains between the input and output of the NN, then the NN could not make predictions informed by the data. We study new percolation models that mimic dropout in NNs and characterise the relationship between network topology and this path problem. The theory shows the existence of a percolative effect in dropout. We also show that this percolative effect can cause a breakdown when NNs without biases are trained with dropout, and we argue heuristically that this breakdown extends to NNs with biases.
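
The path problem described in the abstract can be illustrated with a small simulation. The sketch below is not the paper's model: it assumes a fully connected layered network in which each connection is kept independently with probability `keep_prob` (plain Bernoulli bond percolation), and it estimates how often no path remains from the input layer to the output layer. The function names `has_input_output_path` and `disconnection_frequency`, and all parameters, are hypothetical and chosen only for illustration.

```python
import random


def has_input_output_path(layer_widths, keep_prob, rng):
    """Return True if some input unit still reaches some output unit.

    Assumption (not from the paper): every connection between consecutive
    layers is kept independently with probability keep_prob.
    """
    # Units of the current layer that are reachable from the input layer.
    reachable = [True] * layer_widths[0]
    for width_out in layer_widths[1:]:
        next_reachable = [False] * width_out
        for unit_is_reachable in reachable:
            if not unit_is_reachable:
                # Connections leaving an unreachable unit cannot create a path.
                continue
            for j in range(width_out):
                if rng.random() < keep_prob:  # this connection survives
                    next_reachable[j] = True
        reachable = next_reachable
        if not any(reachable):
            return False  # the network is cut at this layer
    return any(reachable)


def disconnection_frequency(layer_widths, keep_prob, trials=2000, seed=0):
    """Estimate the probability that input and output are disconnected."""
    rng = random.Random(seed)
    disconnected = sum(
        not has_input_output_path(layer_widths, keep_prob, rng)
        for _ in range(trials)
    )
    return disconnected / trials


if __name__ == "__main__":
    widths = [2, 2, 2, 2, 2]  # a narrow, deep toy network
    for p in (0.2, 0.5, 0.8):
        print(p, disconnection_frequency(widths, p))
```

In this toy setting, narrow or deep networks lose their last input-output path more often at a given `keep_prob` than wide or shallow ones, which is one simple way to see why network topology matters for the path problem.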

Submission history

From: Jaron Sanders
[v1] Mon, 15 Dec 2025 19:39:25 UTC (756 KB)
[v2] Mon, 15 Jun 2026 19:28:47 UTC (580 KB)