Redirecting to original paper in 30 seconds...

Click below to go immediately or wait for automatic redirect

arxiv_ml 70% Match Research Paper Statisticians,Probabilists,Theoretical Machine Learning Researchers,Physicists 2 weeks ago

Which exceptional low-dimensional projections of a Gaussian point cloud can be found in polynomial time?

graph-neural-networks › graph-learning
📄 Abstract

Abstract: Given $d$-dimensional standard Gaussian vectors $\boldsymbol{x}_1,\dots, \boldsymbol{x}_n$, we consider the set of all empirical distributions of its $m$-dimensional projections, for $m$ a fixed constant. Diaconis and Freedman (1984) proved that, if $n/d\to \infty$, all such distributions converge to the standard Gaussian distribution. In contrast, we study the proportional asymptotics, whereby $n,d\to \infty$ with $n/d\to \alpha \in (0, \infty)$. In this case, the projection of the data points along a typical random subspace is again Gaussian, but the set $\mathscr{F}_{m,\alpha}$ of all probability distributions that are asymptotically feasible as $m$-dimensional projections contains non-Gaussian distributions corresponding to exceptional subspaces. Non-rigorous methods from statistical physics yield an indirect characterization of $\mathscr{F}_{m,\alpha}$ in terms of a generalized Parisi formula. Motivated by the goal of putting this formula on a rigorous basis, and to understand whether these projections can be found efficiently, we study the subset $\mathscr{F}^{\rm alg}_{m,\alpha}\subseteq \mathscr{F}_{m,\alpha}$ of distributions that can be realized by a class of iterative algorithms. We prove that this set is characterized by a certain stochastic optimal control problem, and obtain a dual characterization of this problem in terms of a variational principle that extends Parisi's formula. As a byproduct, we obtain computationally achievable values for a class of random optimization problems including `generalized spherical perceptron' models.
Authors (2)
Andrea Montanari
Kangjie Zhou
Submitted
June 5, 2024
arXiv Category
math.PR
arXiv PDF

Key Contributions

This paper rigorously characterizes the set of feasible probability distributions of m-dimensional projections of d-dimensional Gaussian vectors when both n (number of vectors) and d (dimension) go to infinity proportionally. It connects this problem to statistical physics and the generalized Parisi formula, providing a rigorous basis for understanding non-Gaussian distributions arising from exceptional subspaces.

Business Value

Provides fundamental insights into the behavior of high-dimensional data under projection, which can inform dimensionality reduction techniques and statistical modeling in complex systems.