Abstract
This paper studies the high-dimensional scaling limits of online stochastic
gradient descent (SGD) for single-layer networks. Building on the seminal work
of Saad and Solla, which analyzed the deterministic (ballistic) scaling limits
of SGD corresponding to the gradient flow of the population loss, we focus on
the critical scaling regime of the step size. Below this critical scale, the
effective dynamics are governed by ballistic (ODE) limits, but at the critical
scale, a new correction term appears that changes the phase diagram. In this
regime, near the fixed points, the corresponding diffusive (SDE) limits of the
effective dynamics reduce to an Ornstein-Uhlenbeck process under certain
conditions. These results highlight how the information exponent controls
sample complexity and illustrate the limitations of deterministic scaling
limits in capturing the stochastic fluctuations of high-dimensional learning
dynamics.
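The setting the abstract describes can be illustrated with a minimal sketch (not the paper's exact model or constants): online SGD for a single-index model on the sphere, where the quantity whose ODE/SDE limits are analyzed is the overlap between the current iterate and the ground-truth direction. The link function, dimension, and step size below are illustrative choices.

```python
import numpy as np

# Hypothetical sketch: online SGD for a single-index model y = f(<theta_*, x>)
# on the unit sphere, tracking the overlap m_t = <theta_t, theta_*> -- the
# summary statistic whose scaling limits the abstract discusses. We use
# f(z) = z (information exponent 1); links with higher information exponent
# (e.g. Hermite polynomials He_k) make escape from the uninformative region
# far slower, which is how the information exponent raises sample complexity.
rng = np.random.default_rng(0)
d = 200                          # ambient dimension
steps = 5_000
eta = 0.25 / d                   # step size on the 1/d scale

theta_star = np.zeros(d)
theta_star[0] = 1.0              # ground-truth direction
theta = rng.standard_normal(d)
theta /= np.linalg.norm(theta)   # random init: overlap ~ 1/sqrt(d)

overlaps = []
for _ in range(steps):
    x = rng.standard_normal(d)        # fresh sample each step: online SGD
    err = theta @ x - theta_star @ x  # residual for the linear link f(z) = z
    theta -= eta * 2.0 * err * x      # gradient step on the squared loss
    theta /= np.linalg.norm(theta)    # project back onto the sphere
    overlaps.append(theta @ theta_star)

print(f"overlap: start {overlaps[0]:.3f} -> end {overlaps[-1]:.3f}")
```

With a step size of order 1/d as above, the trajectory of the overlap concentrates around its deterministic (ballistic) ODE limit; the paper's critical-scale regime concerns larger step sizes, where stochastic corrections survive in the limit.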
Key Contributions
This paper analyzes the high-dimensional scaling limits of online SGD for single-layer networks, focusing on the critical step size regime. It shows that below this scale, dynamics follow ballistic (ODE) limits, but at the critical scale, new correction terms appear. Near fixed points, the diffusive (SDE) limits reduce to an Ornstein-Uhlenbeck process, revealing how the information exponent controls sample complexity and highlighting limitations of deterministic scaling limits.
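The Ornstein-Uhlenbeck behavior near fixed points can be sketched with a simple Euler-Maruyama simulation; the drift and noise coefficients below are illustrative placeholders, not constants derived in the paper.

```python
import numpy as np

# Hypothetical sketch: near a fixed point, the abstract states that the
# diffusive (SDE) limit of the rescaled fluctuations reduces to an
# Ornstein-Uhlenbeck process  dM_t = -a * M_t dt + sigma * dB_t,
# whose stationary law is N(0, sigma^2 / (2a)). The coefficients a and sigma
# here are illustrative, not taken from the paper.
rng = np.random.default_rng(1)
a, sigma = 2.0, 0.5
dt, steps, paths = 1e-3, 20_000, 2_000  # total time 20 >> relaxation time 1/a

M = np.zeros(paths)                     # many independent paths, all from 0
for _ in range(steps):
    # Euler-Maruyama step: deterministic mean reversion + Gaussian noise
    M += -a * M * dt + sigma * np.sqrt(dt) * rng.standard_normal(paths)

emp_var = M.var()
theo_var = sigma ** 2 / (2 * a)         # stationary variance of the OU process
print(f"empirical variance {emp_var:.4f} vs stationary {theo_var:.4f}")
```

After many relaxation times the empirical variance across paths matches the stationary OU variance, which is the kind of Gaussian fluctuation picture a purely deterministic (ODE) scaling limit cannot capture.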
Business Value
A deeper theoretical understanding of SGD dynamics can lead to the development of more efficient and stable training algorithms for deep learning models, potentially reducing training time and improving performance.