📄 Abstract
Diffusion-based generative models have demonstrated exceptional performance,
yet their iterative sampling procedures remain computationally expensive. A
prominent strategy to mitigate this cost is distillation, with offline
distillation offering particular advantages in terms of efficiency, modularity,
and flexibility. In this work, we identify two key observations that motivate a
principled distillation framework: (1) while diffusion models have been viewed
through the lens of dynamical systems theory, powerful and underexplored tools
can be further leveraged; and (2) diffusion models inherently impose
structured, semantically coherent trajectories in latent space. Building on
these observations, we introduce the Koopman Distillation Model (KDM), a novel
offline distillation approach grounded in Koopman theory, a classical
framework for representing nonlinear dynamics linearly in a transformed space.
KDM encodes noisy inputs into an embedded space where a learned linear operator
propagates them forward, followed by a decoder that reconstructs clean samples.
This enables single-step generation while preserving semantic fidelity. We
provide theoretical justification for our approach: (1) under mild assumptions,
the learned diffusion dynamics admit a finite-dimensional Koopman
representation; and (2) proximity in the Koopman latent space correlates with
semantic similarity in the generated outputs, allowing for effective trajectory
alignment. KDM achieves highly competitive performance across standard offline
distillation benchmarks.
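
For readers unfamiliar with Koopman theory, the idea the abstract relies on can be written out as follows. This is the standard textbook definition, not the paper's own notation: the symbols F (the nonlinear one-step map), g (an observable), and the names E, C, D for the encoder, the learned linear operator, and the decoder are assumptions introduced here for illustration.

```latex
% Assumed notation: F is the nonlinear one-step map, g an observable,
% \mathcal{K} the Koopman operator; E, C, D are hypothetical names for
% the encoder, the learned linear operator, and the decoder.
\begin{align}
  x_{k+1} &= F(x_k), \\
  (\mathcal{K} g)(x) &= g\bigl(F(x)\bigr), \\
  \mathcal{K}(a g_1 + b g_2) &= a\,\mathcal{K} g_1 + b\,\mathcal{K} g_2, \\
  \hat{x}_0 &= D\bigl(C\,E(x_T)\bigr).
\end{align}
```

The first three lines state that the Koopman operator advances observables rather than states and is linear even when F is not. The last line is one way to read the encode, propagate, decode pipeline described in the abstract, with x_T the noisy input and \hat{x}_0 the generated sample.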
Authors (5)
Nimrod Berman
Ilan Naiman
Moshe Eliasof
Hedi Zisling
Omri Azencot
Key Contributions
This paper introduces the Koopman Distillation Model (KDM), a novel offline distillation approach for diffusion models. KDM leverages Koopman theory to represent nonlinear dynamics linearly in a transformed space: an encoder maps noisy inputs into an embedded space, a learned linear operator propagates them forward, and a decoder reconstructs clean samples. This enables single-step generation and addresses the computational expense of iterative sampling in diffusion models.
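
The encode, propagate, decode structure can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration: the dimensions, the function names, the single-layer tanh encoder, and the random weights, which stand in for networks that would in practice be trained. It is not the authors' architecture or training procedure; it only shows that generation reduces to one forward pass with a purely linear step in the latent space.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes, chosen only for the example.
data_dim, latent_dim = 8, 16

# Random placeholders for what would be learned parameters.
W_enc = rng.normal(size=(data_dim, latent_dim)) / np.sqrt(data_dim)
W_dec = rng.normal(size=(latent_dim, data_dim)) / np.sqrt(latent_dim)
K = rng.normal(size=(latent_dim, latent_dim)) / np.sqrt(latent_dim)


def encode(x_noisy):
    """Nonlinear map from noisy inputs to the embedded (latent) space."""
    return np.tanh(x_noisy @ W_enc)


def propagate(z):
    """Linear step in the latent space: a single matrix multiplication."""
    return z @ K.T


def decode(z):
    """Map propagated latents back to data space."""
    return z @ W_dec


def generate(x_noisy):
    """Single-step generation: encode, propagate linearly, decode."""
    return decode(propagate(encode(x_noisy)))


noise = rng.normal(size=(4, data_dim))  # a batch of 4 noisy inputs
samples = generate(noise)
```

The design point is that all nonlinearity lives in the encoder (and, in a realistic model, the decoder), while the dynamics between noise and data are carried by the single matrix `K`.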
Business Value
Reduces the computational cost of generating high-quality content with diffusion models, making them more accessible for applications requiring faster inference, such as real-time image generation or interactive content creation.