📄 Abstract
Stochastic Gradient Descent (SGD) has become a cornerstone method in modern
data science. However, deploying SGD in high-stakes applications necessitates
rigorous quantification of its inherent uncertainty. In this work, we establish
\emph{non-asymptotic Berry--Esseen bounds} for linear functionals of online
least-squares SGD, thereby providing a Gaussian Central Limit Theorem (CLT) in
a \emph{growing-dimensional regime}. Existing approaches to high-dimensional
inference for projection parameters, such as~\cite{chang2023inference}, rely on
inverting empirical covariance matrices and require at least $t \gtrsim
d^{3/2}$ iterations to achieve finite-sample Berry--Esseen guarantees,
rendering them computationally expensive and restrictive in the allowable
dimensional scaling. In contrast, we show that a CLT holds for SGD iterates
when the number of iterations grows as $t \gtrsim d^{1+\delta}$ for any $\delta
> 0$, significantly extending the dimensional regime permitted by prior works
while improving computational efficiency. The proposed online SGD-based
procedure operates in $\mathcal{O}(td)$ time and requires only $\mathcal{O}(d)$
memory, in contrast to the $\mathcal{O}(td^2 + d^3)$ runtime of
covariance-inversion methods. To render the theory practically applicable, we
further develop an \emph{online variance estimator} for the asymptotic variance
appearing in the CLT and establish \emph{high-probability deviation bounds} for
this estimator. Collectively, these results yield the first fully online and
data-driven framework for constructing confidence intervals for SGD iterates in
the near-optimal scaling regime $t \gtrsim d^{1+\delta}$.
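To make the $\mathcal{O}(td)$-time, $\mathcal{O}(d)$-memory claim concrete, the following is a minimal sketch (not the paper's exact procedure) of fully online least-squares SGD with Polyak–Ruppert averaging, tracking a linear functional $v^\top \bar{\theta}_t$ and forming a confidence interval. The step-size schedule, batch size, and the batch-means variance estimator here are illustrative assumptions; the paper develops its own online variance estimator with high-probability deviation bounds.

```python
import numpy as np

rng = np.random.default_rng(0)

d, t_max = 5, 20_000
theta_star = rng.standard_normal(d) / np.sqrt(d)  # unknown ground truth
v = np.zeros(d); v[0] = 1.0                        # linear functional v^T theta

theta = np.zeros(d)       # current SGD iterate
theta_bar = np.zeros(d)   # Polyak-Ruppert running average

# Illustrative online variance tracking via non-overlapping batch means
# (an assumption for this sketch, not the paper's estimator).
batch, batch_sums, cur_sum, cnt = 100, [], 0.0, 0

for t in range(1, t_max + 1):
    # Stream one observation (x_t, y_t) of the linear model.
    x = rng.standard_normal(d)
    y = x @ theta_star + 0.1 * rng.standard_normal()
    eta = 0.1 / np.sqrt(t)                 # assumed step-size schedule
    theta -= eta * (x @ theta - y) * x     # O(d) least-squares SGD step
    theta_bar += (theta - theta_bar) / t   # O(d) running average update
    # Accumulate the functional over a batch; store each batch mean.
    cur_sum += v @ theta
    cnt += 1
    if cnt == batch:
        batch_sums.append(cur_sum / batch)
        cur_sum, cnt = 0.0, 0

est = v @ theta_bar
bm = np.array(batch_sums)
se = bm.std(ddof=1) / np.sqrt(len(bm))     # crude plug-in standard error
ci = (est - 1.96 * se, est + 1.96 * se)    # nominal 95% interval via the CLT
```

The whole loop touches each observation once and keeps only $\mathcal{O}(d)$ state (iterate, average, scalar batch accumulators), in contrast to covariance-inversion methods that must form and invert a $d \times d$ empirical covariance matrix.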
Authors (3)
Bhavya Agrawalla
Krishnakumar Balasubramanian
Promit Ghosal
Submitted
October 22, 2025
Key Contributions
Establishes non-asymptotic Berry-Esseen bounds for linear functionals of online least-squares SGD in a growing-dimensional regime ($t \gtrsim d^{1+\delta}$). This significantly extends the dimensional scaling previously possible for finite-sample guarantees, offering a more computationally efficient approach than methods requiring matrix inversion.
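For orientation, the asymptotic target that the paper's Berry–Esseen bound quantifies is the classical Polyak–Ruppert central limit theorem for the averaged least-squares iterate; the display below is that standard limiting statement (not the paper's finite-sample theorem, whose constants, rates, and growing-$d$ conditions are given in the paper):

```latex
\sqrt{t}\, v^\top\!\big(\bar{\theta}_t - \theta^\star\big)
  \;\xrightarrow{d}\;
  \mathcal{N}\!\big(0,\; \sigma^2\, v^\top A^{-1} v\big),
  \qquad A = \mathbb{E}\!\left[x x^\top\right],
```

where $\bar{\theta}_t$ is the averaged SGD iterate, $\sigma^2$ the noise variance, and $A$ the population covariance of the covariates. The paper's contribution is a non-asymptotic Gaussian approximation of this form valid once $t \gtrsim d^{1+\delta}$.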
Business Value
Provides stronger theoretical foundations for using SGD in high-stakes applications by offering reliable uncertainty quantification, which is crucial for trust and safety in AI systems.