Abstract
Feature learning (FL), where neural networks adapt their internal
representations during training, remains poorly understood. Using methods from
statistical physics, we derive a tractable, self-consistent mean-field (MF)
theory for the Bayesian posterior of two-layer non-linear networks trained with
stochastic gradient Langevin dynamics (SGLD). At infinite width, this theory
reduces to kernel ridge regression, but at finite width it predicts a
symmetry-breaking phase transition where networks abruptly align with target
functions. While the basic MF theory provides theoretical insight into the
emergence of FL in the finite-width regime, semi-quantitatively predicting the
onset of FL as a function of noise or sample size, it substantially
underestimates the improvements in generalisation after the transition. We
trace this discrepancy to a key mechanism absent from the plain MF description:
self-reinforcing input feature selection. Incorporating this mechanism into the
MF theory allows us to quantitatively match the learning curves of
SGLD-trained networks and provides mechanistic insight into FL.
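To make the training setup concrete, the sketch below runs SGLD on a two-layer tanh network for a toy regression task. It is not the paper's code: the architecture, target function, step size, temperature, and weight decay are illustrative assumptions, and the mini-batch noise is handled schematically rather than with the exact posterior scaling.

```python
# Minimal SGLD sketch for a two-layer network on toy data (illustrative assumptions only).
import numpy as np

rng = np.random.default_rng(0)

# Toy data: scalar target y = sin(2*x0); only the first input coordinate matters.
n, d, width = 256, 8, 64
X = rng.normal(size=(n, d))
y = np.sin(2.0 * X[:, 0])

# Two-layer network f(x) = a^T tanh(W x) / sqrt(width).
W = rng.normal(size=(width, d)) / np.sqrt(d)
a = rng.normal(size=width)

lr, temperature, weight_decay, batch = 1e-3, 1e-3, 1e-2, 32

def forward(Xb, W, a):
    h = np.tanh(Xb @ W.T)                 # (batch, width) hidden activations
    return h, h @ a / np.sqrt(width)      # hidden layer and network output

for step in range(20_000):
    idx = rng.choice(n, size=batch, replace=False)
    Xb, yb = X[idx], y[idx]
    h, pred = forward(Xb, W, a)
    err = pred - yb                        # residuals on the mini-batch

    # Gradients of the mean squared loss plus a Gaussian prior (weight decay).
    grad_a = h.T @ err / (batch * np.sqrt(width)) + weight_decay * a
    grad_h = np.outer(err, a / np.sqrt(width)) * (1.0 - h**2)
    grad_W = grad_h.T @ Xb / batch + weight_decay * W

    # SGLD update: gradient step plus Gaussian noise with variance 2 * lr * temperature.
    a += -lr * grad_a + np.sqrt(2 * lr * temperature) * rng.normal(size=a.shape)
    W += -lr * grad_W + np.sqrt(2 * lr * temperature) * rng.normal(size=W.shape)

_, train_pred = forward(X, W, a)
print("train MSE:", np.mean((train_pred - y) ** 2))
```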
Authors (4)
Niclas GΓΆring
Chris Mingard
Yoonsoo Nam
Ard Louis
Submitted
October 16, 2025
Key Contributions
This paper derives a tractable mean-field (MF) theory for feature learning in two-layer neural networks trained with SGLD, using methods from statistical physics. It predicts a phase transition where networks align with target functions and identifies 'self-reinforcing input feature selection' as a key mechanism that is absent from the basic MF theory yet crucial for explaining the improved generalisation after the transition.
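For reference, the infinite-width baseline mentioned in the abstract is kernel ridge regression. The sketch below shows plain KRR with an RBF kernel on the same toy task; the kernel choice and regularisation strength are illustrative assumptions, not the network-derived kernel the theory actually reduces to.

```python
# Kernel ridge regression baseline (illustrative RBF kernel, not the paper's network kernel).
import numpy as np

def rbf_kernel(A, B, lengthscale=1.0):
    # Pairwise squared distances, then the Gaussian (RBF) kernel.
    sq = np.sum(A**2, 1)[:, None] + np.sum(B**2, 1)[None, :] - 2.0 * A @ B.T
    return np.exp(-sq / (2.0 * lengthscale**2))

def krr_fit_predict(X_train, y_train, X_test, ridge=1e-2):
    K = rbf_kernel(X_train, X_train)
    # Solve (K + ridge * I) alpha = y for the dual coefficients.
    alpha = np.linalg.solve(K + ridge * np.eye(len(X_train)), y_train)
    return rbf_kernel(X_test, X_train) @ alpha

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = np.sin(2.0 * X[:, 0])
X_test = rng.normal(size=(50, 8))
y_test = np.sin(2.0 * X_test[:, 0])

pred = krr_fit_predict(X, y, X_test)
print("test MSE:", np.mean((pred - y_test) ** 2))
```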
Business Value
Deeper theoretical understanding of how neural networks learn representations, potentially leading to more efficient and effective model design and training.