Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: Given $d$-dimensional standard Gaussian vectors $\boldsymbol{x}_1,\dots,
\boldsymbol{x}_n$, we consider the set of all empirical distributions of its
$m$-dimensional projections, for $m$ a fixed constant. Diaconis and Freedman
(1984) proved that, if $n/d\to \infty$, all such distributions converge to the
standard Gaussian distribution. In contrast, we study the proportional
asymptotics, whereby $n,d\to \infty$ with $n/d\to \alpha \in (0, \infty)$. In
this case, the projection of the data points along a typical random subspace is
again Gaussian, but the set $\mathscr{F}_{m,\alpha}$ of all probability
distributions that are asymptotically feasible as $m$-dimensional projections
contains non-Gaussian distributions corresponding to exceptional subspaces.
Non-rigorous methods from statistical physics yield an indirect
characterization of $\mathscr{F}_{m,\alpha}$ in terms of a generalized Parisi
formula. Motivated by the goal of putting this formula on a rigorous basis, and
to understand whether these projections can be found efficiently, we study the
subset $\mathscr{F}^{\rm alg}_{m,\alpha}\subseteq \mathscr{F}_{m,\alpha}$ of
distributions that can be realized by a class of iterative algorithms. We prove
that this set is characterized by a certain stochastic optimal control problem,
and obtain a dual characterization of this problem in terms of a variational
principle that extends Parisi's formula.
As a byproduct, we obtain computationally achievable values for a class of
random optimization problems including `generalized spherical perceptron'
models.
Authors (2)
Andrea Montanari
Kangjie Zhou
Key Contributions
This paper rigorously characterizes the set of feasible probability distributions of m-dimensional projections of d-dimensional Gaussian vectors when both n (number of vectors) and d (dimension) go to infinity proportionally. It connects this problem to statistical physics and the generalized Parisi formula, providing a rigorous basis for understanding non-Gaussian distributions arising from exceptional subspaces.
Business Value
Provides fundamental insights into the behavior of high-dimensional data under projection, which can inform dimensionality reduction techniques and statistical modeling in complex systems.