





















Abstract:Distributed stochastic optimization intertwines (i) stochastic gradient noise, (ii) communication compression, and (iii) adaptive/normalized updates. While each factor has been studied in isolation, their joint effect under realistic assumptions remains poorly understood. In this work, we develop a unified theoretical framework for Distributed Compressed SGD (DCSGD) and its sign variant Distributed SignSGD (DSignSGD) under the recently introduced $(L_0, L_1)$-smoothness condition. From a conceptual perspective, we show that the first- and second-order modified equations from the literature do not accurately model the discrete-time step-size/stability restrictions, especially under $(L_0,L_1)$-smoothness. From a technical perspective, we propose new first-order SDEs by carefully incorporating curvature-dependent terms into their drift: This helps capture the fine-grained relationship between learning rate restrictions, gradient noise, compression, and the geometry of the loss landscape. Importantly, we do so under general gradient noise assumptions, including heavy-tailed and affine-variance regimes, which extend beyond the classical bounded-variance setting. Our results suggest that normalizing the updates of DCSGD emerges as a natural condition for stability, with the degree of normalization precisely determined by the gradient noise structure, the landscape's regularity, and the compression rate. In contrast, DSignSGD converges even under heavy-tailed noise with standard learning rate schedules. Together, these findings offer both new theoretical insights and perspectives, and practical guidance.
| Comments: | Accepted at ICML 2026 (Poster) |
| Subjects: | Machine Learning (cs.LG); Machine Learning (stat.ML) |
| Cite as: | arXiv:2506.00181 [cs.LG] |
| (or arXiv:2506.00181v2 [cs.LG] for this version) | |
| https://doi.org/10.48550/arXiv.2506.00181 arXiv-issued DOI via DataCite |
From: Enea Monzio Compagnoni Mr. [view email]
[v1]
Fri, 30 May 2025 19:35:15 UTC (522 KB)
[v2]
Fri, 22 May 2026 19:40:10 UTC (2,632 KB)
此内容由惯性聚合(RSS阅读器)自动聚合整理,仅供阅读参考。 原文来自 — 版权归原作者所有。