The Spacetime of Diffusion Models: An Information Geometry Perspective

Abstract

Abstract: We present a novel geometric perspective on the latent space of diffusion models. We first show that the standard pullback approach, utilizing the deterministic probability flow ODE decoder, is fundamentally flawed. It provably forces geodesics to decode as straight segments in data space, effectively ignoring any intrinsic data geometry beyond the ambient Euclidean space. Complementing this view, diffusion also admits a stochastic decoder via the reverse SDE, which enables an information geometric treatment with the Fisher-Rao metric. However, a choice of $x_T$ as the latent representation collapses this metric due to memorylessness. We address this by introducing a latent spacetime $z=(x_t,t)$ that indexes the family of denoising distributions $p(x_0 | x_t)$ across all noise scales, yielding a nontrivial geometric structure. We prove these distributions form an exponential family and derive simulation-free estimators for curve lengths, enabling efficient geodesic computation. The resulting structure induces a principled Diffusion Edit Distance, where geodesics trace minimal sequences of noise and denoise edits between data. We also demonstrate benefits for transition path sampling in molecular systems, including constrained variants such as low-variance transitions and region avoidance. Code is available at: https://github.com/rafalkarczewski/spacetime-geometry
Authors (5)
Rafał Karczewski
Markus Heinonen
Alison Pouplin
Søren Hauberg
Vikas Garg
Submitted
May 23, 2025
arXiv Category
cs.LG

Key Contributions

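This paper introduces an information geometric perspective on the latent space of diffusion models. It shows that the standard pullback approach, built on the deterministic probability flow ODE decoder, forces geodesics to decode as straight segments in data space, and that choosing $x_T$ as the latent representation collapses the Fisher-Rao metric. To address this, it proposes a latent spacetime $z=(x_t,t)$ that indexes the denoising distributions $p(x_0 | x_t)$, proves that these distributions form an exponential family, and derives simulation-free estimators for curve lengths that make geodesic computation efficient.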
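
The spacetime idea can be illustrated on a toy problem. The sketch below is not the paper's estimator: it assumes 1-D Gaussian data $x_0 \sim N(0, 1)$ and a hypothetical variance-preserving schedule, so that $p(x_0 | x_t)$ is Gaussian in closed form. It approximates the Fisher-Rao length of a discretized curve with the standard local relation $\mathrm{KL}(p \| q) \approx \tfrac{1}{2} ds^2$ between nearby distributions. All function names (`schedule`, `posterior`, `curve_length`, `segment`) are invented for illustration.

```python
import math

def schedule(t):
    """Hypothetical variance-preserving schedule (assumption): alpha^2 + sigma^2 = 1."""
    return math.cos(0.5 * math.pi * t), math.sin(0.5 * math.pi * t)

def posterior(x_t, t, data_var=1.0):
    """Mean and variance of p(x_0 | x_t) for the toy model
    x_0 ~ N(0, data_var), x_t = alpha_t * x_0 + sigma_t * noise. Requires t > 0."""
    alpha, sigma = schedule(t)
    var = 1.0 / (1.0 / data_var + alpha**2 / sigma**2)
    mean = var * alpha * x_t / sigma**2
    return mean, var

def kl_gauss(m0, v0, m1, v1):
    """KL( N(m0, v0) || N(m1, v1) ) for 1-D Gaussians."""
    return 0.5 * (math.log(v1 / v0) + (v0 + (m0 - m1) ** 2) / v1 - 1.0)

def curve_length(points):
    """Approximate Fisher-Rao length of a curve given as (x_t, t) points,
    using ds ~ sqrt(2 * KL) between consecutive denoising distributions."""
    total = 0.0
    for (xa, ta), (xb, tb) in zip(points, points[1:]):
        kl = kl_gauss(*posterior(xa, ta), *posterior(xb, tb))
        total += math.sqrt(max(2.0 * kl, 0.0))  # clamp tiny negative round-off
    return total

def segment(start, end, n=200):
    """Straight line between two spacetime points, sampled at n + 1 points."""
    (x0, t0), (x1, t1) = start, end
    return [(x0 + (x1 - x0) * k / n, t0 + (t1 - t0) * k / n) for k in range(n + 1)]

# Same pair of x values, compared at the final time and at an intermediate time.
len_T = curve_length(segment((-3.0, 1.0), (3.0, 1.0)))    # t = T: pure noise
len_mid = curve_length(segment((-3.0, 0.3), (3.0, 0.3)))  # intermediate noise level
print(len_T, len_mid)
```

In this toy setting the length at $t = T$ is numerically zero, because $p(x_0 | x_T)$ no longer depends on $x_T$, which mirrors the collapse due to memorylessness described in the abstract. At an intermediate noise level the same two points are separated by a clearly positive length, which is the nontrivial structure the spacetime coordinates $(x_t, t)$ are meant to expose.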

Business Value

The work offers a deeper theoretical understanding of diffusion models, which could support more efficient and controllable generative models in applications such as image synthesis, drug discovery, and materials design.