Abstract
We present a novel geometric perspective on the latent space of diffusion
models. We first show that the standard pullback approach, utilizing the
deterministic probability flow ODE decoder, is fundamentally flawed. It
provably forces geodesics to decode as straight segments in data space,
effectively ignoring any intrinsic data geometry beyond the ambient Euclidean
space. Complementing this view, diffusion also admits a stochastic decoder via
the reverse SDE, which enables an information geometric treatment with the
Fisher-Rao metric. However, a choice of $x_T$ as the latent representation
collapses this metric due to memorylessness. We address this by introducing a
latent spacetime $z=(x_t,t)$ that indexes the family of denoising distributions
$p(x_0 | x_t)$ across all noise scales, yielding a nontrivial geometric
structure. We prove these distributions form an exponential family and derive
simulation-free estimators for curve lengths, enabling efficient geodesic
computation. The resulting structure induces a principled Diffusion Edit
Distance, where geodesics trace minimal sequences of noise and denoise edits
between data. We also demonstrate benefits for transition path sampling in
molecular systems, including constrained variants such as low-variance
transitions and region avoidance. Code is available at:
https://github.com/rafalkarczewski/spacetime-geometry
Authors (5)
Rafał Karczewski
Markus Heinonen
Alison Pouplin
Søren Hauberg
Vikas Garg
Key Contributions
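This paper introduces an information-geometric perspective on the latent space of diffusion models. It shows that the standard pullback approach, built on the deterministic probability flow ODE decoder, forces geodesics to decode as straight segments in data space, and that using $x_T$ as the latent collapses the Fisher-Rao metric. It then proposes a latent spacetime $z=(x_t,t)$ indexing the denoising distributions $p(x_0 | x_t)$, which form an exponential family. This yields a nontrivial geometric structure with simulation-free curve-length estimators and efficient geodesic computation.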
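To make the exponential-family claim concrete, here is a minimal Python sketch of a Fisher-Rao curve length in a toy latent spacetime. It is not the authors' estimator or code. It assumes a one-dimensional Gaussian forward process $x_t = \alpha x_0 + \sigma \epsilon$ and a standard normal data distribution, so that $p(x_0 | x_t)$ is Gaussian and everything is available in closed form. The function names and the parametrization of a spacetime point as `(x_t, alpha, sigma)` are hypothetical, chosen for illustration. The sketch uses a standard fact about exponential families: the symmetrized KL divergence between two members equals $(\eta_a - \eta_b)^\top(\mu_a - \mu_b)$, which approximates the squared Fisher-Rao length of a short segment.

```python
import numpy as np

def natural_params(x_t, alpha, sigma):
    """Natural parameters of p(x0 | x_t) for sufficient statistic T(x0) = (x0, x0^2).

    Assumes the Gaussian forward kernel x_t = alpha * x0 + sigma * eps.
    """
    return np.array([alpha * x_t / sigma**2, -alpha**2 / (2.0 * sigma**2)])

def expectation_params(x_t, alpha, sigma):
    """Expectation parameters E[T(x0) | x_t].

    Toy assumption: the data distribution is N(0, 1), so the posterior is
    Gaussian with precision 1 + alpha^2 / sigma^2.
    """
    precision = 1.0 + alpha**2 / sigma**2
    mean = (alpha * x_t / sigma**2) / precision
    var = 1.0 / precision
    return np.array([mean, var + mean**2])

def curve_length(points):
    """Approximate Fisher-Rao length of a discretized spacetime curve.

    `points` is a sequence of (x_t, alpha, sigma) tuples. Each segment
    contributes sqrt((d_eta) . (d_mu)), the square root of the symmetrized
    KL divergence between neighbouring denoising distributions.
    """
    etas = np.array([natural_params(*p) for p in points])
    mus = np.array([expectation_params(*p) for p in points])
    seg_sq = np.sum(np.diff(etas, axis=0) * np.diff(mus, axis=0), axis=1)
    # Clip tiny negative values caused by floating-point round-off.
    return float(np.sum(np.sqrt(np.clip(seg_sq, 0.0, None))))

if __name__ == "__main__":
    # Move x_t from 0 to 1 at a fixed noise level (alpha=0.8, sigma=0.6).
    xs = np.linspace(0.0, 1.0, 11)
    print(curve_length([(x, 0.8, 0.6) for x in xs]))
```

At a fixed noise level only the first natural parameter changes, and the length reduces to $\alpha |\Delta x_t| / (\sigma\sqrt{\alpha^2+\sigma^2})$, which is about 1.3333 for the values above. With a learned denoiser in place of the closed-form posterior, the expectation parameters would have to be estimated from the model rather than computed exactly.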
Business Value
Provides a deeper theoretical understanding of diffusion models, which could lead to more efficient and controllable generative models for various applications like image synthesis, drug discovery, and material design.