Abstract
Discrete diffusion models offer a flexible, controllable approach to
structured sequence generation, yet they still lag behind causal language
models in expressive power. A key limitation lies in their reliance on the
Markovian assumption, which restricts each step to condition only on the
current state, leading to potentially uncorrectable error accumulation. In this
paper, we introduce CaDDi (Causal Discrete Diffusion Model), a discrete
diffusion model that conditions on the entire generative trajectory, thereby
lifting the Markov constraint and allowing the model to revisit and improve
past states. By unifying sequential (causal) and temporal (diffusion) reasoning
in a single non-Markovian transformer, CaDDi also treats standard causal
language models as a special case and permits the direct reuse of pretrained
LLM weights with no architectural changes. Empirically, CaDDi outperforms
state-of-the-art discrete diffusion baselines on natural-language benchmarks,
substantially narrowing the remaining gap to large autoregressive transformers.
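To make the trajectory conditioning concrete, the sketch below contrasts a standard Markovian reverse step with CaDDi-style decoding. It is a minimal illustration, not the paper's implementation: the `denoiser` interface, the `toy_denoiser`, and the mask-token convention are assumptions introduced here.

```python
# Minimal sketch (not the authors' code) contrasting a Markovian reverse
# step with non-Markovian, trajectory-conditioned decoding as described
# in the abstract. `Denoiser` is a hypothetical stand-in for the model.
from typing import Callable, List

Sequence = List[int]                          # a token-id sequence
Denoiser = Callable[[List[Sequence], int], Sequence]

def markovian_step(denoiser: Denoiser, x_t: Sequence, t: int) -> Sequence:
    # Standard discrete diffusion: the model sees only the current state.
    return denoiser([x_t], t)

def caddi_decode(denoiser: Denoiser, x_T: Sequence, T: int) -> Sequence:
    # Non-Markovian decoding: every step conditions on the full trajectory
    # (x_T, ..., x_t), so earlier states stay visible and revisable.
    trajectory = [x_T]
    for t in range(T, 0, -1):
        x_prev = denoiser(trajectory, t)      # sees all past states
        trajectory.append(x_prev)
    return trajectory[-1]                     # x_0, the final clean sequence

if __name__ == "__main__":
    # Toy denoiser: fills one masked token (id 0) per step with a dummy id.
    def toy_denoiser(traj: List[Sequence], t: int) -> Sequence:
        x = list(traj[-1])
        for i, tok in enumerate(x):
            if tok == 0:
                x[i] = 7                      # pretend-predicted token
                break
        return x

    print(caddi_decode(toy_denoiser, x_T=[0, 0, 0, 0], T=4))
```

Because each call sees the whole trajectory rather than only `traj[-1]`, the model can in principle detect and revise tokens that an earlier step committed to incorrectly; a Markovian sampler has no such signal.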
Authors (10)
Yangtian Zhang
Sizhuang He
Daniel Levine
Lawrence Zhao
David Zhang
Syed A Rizvi
+4 more
Submitted
February 13, 2025
Key Contributions
Introduces CaDDi, a non-Markovian discrete diffusion model that conditions on the entire generative trajectory. Lifting the Markov constraint and unifying sequential (causal) and temporal (diffusion) reasoning in a single transformer lets CaDDi reuse pretrained LLM weights directly.
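The weight-reuse claim can be pictured with a short, hypothetical sketch: since CaDDi's backbone is a standard decoder-only transformer, an off-the-shelf causal-LM checkpoint can initialize it directly. The checkpoint name and the Hugging Face loading call below are illustrative assumptions, not details from the paper.

```python
# Hypothetical sketch of direct weight reuse (not the authors' code).
from transformers import AutoModelForCausalLM

# CaDDi keeps the decoder-only architecture unchanged, so a pretrained
# causal LM can serve as the starting point with no changes to the weights.
model = AutoModelForCausalLM.from_pretrained("gpt2")  # illustrative checkpoint
# From here, training would swap the next-token objective for the
# non-Markovian diffusion objective (training loop omitted).
```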
Business Value
Enables more powerful and controllable generation of structured text, useful for applications like creative writing, code generation, and complex dialogue systems.