📄 Abstract
Weak-to-strong generalization refers to the phenomenon where a stronger model
trained under supervision from a weaker one can outperform its teacher. While
prior studies aim to explain this effect, most theoretical insights are limited
to abstract frameworks or linear/random feature models. In this paper, we
provide a formal analysis of weak-to-strong generalization from a linear CNN
(weak) to a two-layer ReLU CNN (strong). We consider structured data composed
of label-dependent signals of varying difficulty and label-independent noise,
and analyze gradient descent dynamics when the strong model is trained on data
labeled by the pretrained weak model. Our analysis identifies two regimes --
data-scarce and data-abundant -- based on the signal-to-noise characteristics
of the dataset, and reveals distinct mechanisms of weak-to-strong
generalization. In the data-scarce regime, generalization occurs via benign
overfitting or fails via harmful overfitting, depending on the amount of data,
and we characterize the transition boundary. In the data-abundant regime,
generalization emerges in the early phase through label correction, but we
observe that overtraining can subsequently degrade performance.
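The abstract describes a concrete pipeline: a pretrained linear CNN (the weak teacher) labels data consisting of label-dependent signal patches of varying difficulty plus label-independent noise, and a two-layer ReLU CNN (the strong student) is then trained on those weak labels with gradient descent. Below is a minimal, hypothetical sketch of that pipeline in PyTorch; all names, dimensions, widths, and signal-to-noise values are illustrative assumptions and do not reproduce the paper's exact construction or analysis.

```python
# Hypothetical sketch of the weak-to-strong setup sketched in the abstract.
# Dimensions, widths, and SNR values below are assumptions, not the paper's.
import torch
import torch.nn as nn

torch.manual_seed(0)
P, d = 4, 64  # patches per example, patch dimension (assumed)

def make_data(n, easy_snr=3.0, hard_snr=0.8, noise_std=1.0, v_easy=None, v_hard=None):
    """One label-dependent signal patch (easy or hard) plus label-independent noise patches."""
    v_easy = torch.randn(d) if v_easy is None else v_easy
    v_hard = torch.randn(d) if v_hard is None else v_hard
    y = (torch.randint(0, 2, (n,)) * 2 - 1).float()       # labels in {-1, +1}
    X = noise_std * torch.randn(n, P, d)                   # noise patches
    easy = torch.rand(n) < 0.5
    for i in range(n):
        sig = easy_snr * v_easy if easy[i] else hard_snr * v_hard
        X[i, 0] = y[i] * sig                               # signal patch carries the label
    return X, y, (v_easy, v_hard)

class LinearCNN(nn.Module):
    """Weak teacher: one linear filter applied to every patch, then summed."""
    def __init__(self):
        super().__init__()
        self.w = nn.Parameter(torch.zeros(d))
    def forward(self, X):
        return torch.einsum('npd,d->n', X, self.w)

class TwoLayerReluCNN(nn.Module):
    """Strong student: m positive and m negative ReLU filters shared across patches."""
    def __init__(self, m=16, init_scale=0.01):
        super().__init__()
        self.w_pos = nn.Parameter(init_scale * torch.randn(m, d))
        self.w_neg = nn.Parameter(init_scale * torch.randn(m, d))
    def forward(self, X):
        pos = torch.relu(torch.einsum('npd,md->nmp', X, self.w_pos)).sum(dim=(1, 2))
        neg = torch.relu(torch.einsum('npd,md->nmp', X, self.w_neg)).sum(dim=(1, 2))
        return (pos - neg) / self.w_pos.shape[0]

def train(model, X, y, lr=0.05, steps=500):
    """Full-batch gradient descent on the logistic loss."""
    opt = torch.optim.SGD(model.parameters(), lr=lr)
    for _ in range(steps):
        loss = torch.nn.functional.softplus(-y * model(X)).mean()
        opt.zero_grad()
        loss.backward()
        opt.step()
    return model

# 1) Pretrain the weak teacher on a small, cleanly labeled set.
Xw, yw, signals = make_data(n=128)
teacher = train(LinearCNN(), Xw, yw)

# 2) The teacher labels a fresh set; the student trains on those weak labels.
Xs, _, _ = make_data(n=512, v_easy=signals[0], v_hard=signals[1])
weak_labels = torch.sign(teacher(Xs)).detach()
student = train(TwoLayerReluCNN(), Xs, weak_labels)

# 3) Compare teacher and student against the true labels on held-out data.
Xt, yt, _ = make_data(n=2048, v_easy=signals[0], v_hard=signals[1])
acc = lambda f: (torch.sign(f(Xt)) == yt).float().mean().item()
print(f"teacher acc: {acc(teacher):.3f}  student acc: {acc(student):.3f}")
```

A sketch like this can be used to probe the regimes the abstract mentions, e.g. by varying the student's sample size, the noise level, or the number of training steps and checking whether the student's accuracy exceeds the teacher's.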
Authors (3)
Junsoo Oh
Jerry Song
Chulhee Yun
Submitted
October 28, 2025
Key Contributions
This paper provides a formal theoretical analysis of weak-to-strong generalization in deep learning, specifically from a linear CNN teacher to a two-layer ReLU CNN student. It identifies distinct mechanisms and regimes (data-scarce vs. data-abundant) governing this phenomenon, offering insight into how a strong model trained on weak labels can surpass its teacher through feature learning and gradient descent dynamics.
Business Value
Enhances fundamental understanding of deep learning, potentially informing more robust and efficient strategies for supervising strong models with weaker ones.