Information-Theoretic Discrete Diffusion

Abstract

We present an information-theoretic framework for discrete diffusion models that yields principled estimators of log-likelihood using score-matching losses. Inspired by the I-MMSE identity for the Gaussian setting, we derive analogous results for the discrete setting. Specifically, we introduce the Information-Minimum Denoising Score Entropy (I-MDSE) relation, which links the mutual information between data and its diffused version to the minimum denoising score entropy (DSE) loss. We extend this theory to masked diffusion and establish the Information-Minimum Denoising Cross-Entropy (I-MDCE) relation, connecting cross-entropy losses to mutual information in discrete masked processes. These results provide a time-integral decomposition of the log-likelihood of the data in terms of optimal score-based losses, showing that commonly used losses such as DSE and DCE are not merely variational bounds but tight and principled estimators of log-likelihood. The I-MDCE decomposition further enables practical extensions, including a time-free formula, conditional likelihood estimation in prompt-response tasks, and coupled Monte Carlo estimation of likelihood ratios. Experiments on synthetic and real-world data confirm the accuracy, variance stability, and utility of our estimators. The code is publicly available at https://github.com/Dongjae0324/infodis.
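For readers unfamiliar with the Gaussian antecedent, the first display below is the classical I-MMSE identity the abstract alludes to; the second is only a schematic of the paper's discrete decomposition, with w(t) and L*(x, t) standing in as placeholder symbols for the paper's exact weights and DSE/DCE losses, which are not reproduced here.

```latex
% Gaussian I-MMSE identity (Guo, Shamai & Verdu, 2005), the template the
% paper generalizes: for Y_snr = sqrt(snr) * X + N with N ~ N(0, 1),
\frac{\mathrm{d}}{\mathrm{d}\,\mathrm{snr}}\, I(X; Y_{\mathrm{snr}})
  = \tfrac{1}{2}\,\mathbb{E}\!\left[\big(X - \mathbb{E}[X \mid Y_{\mathrm{snr}}]\big)^{2}\right]
  = \tfrac{1}{2}\,\mathrm{mmse}(\mathrm{snr})

% Schematic of the discrete analogue claimed in the abstract: the
% log-likelihood decomposes as a time integral of the *optimal*
% score-based loss. w(t) and \mathcal{L}^{\star} are placeholders.
-\log p_{\mathrm{data}}(x) \;\stackrel{\text{schematic}}{=}\;
  \int_{0}^{T} w(t)\, \mathcal{L}^{\star}(x, t)\,\mathrm{d}t \;+\; \mathrm{const}
```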
Authors (4): Moongyu Jeon, Sangwoo Shin, Dongjae Jeon, Albert No
Submitted: October 28, 2025
arXiv Category: cs.LG

Key Contributions

This paper introduces an information-theoretic framework for discrete diffusion models, providing principled estimators of log-likelihood using score-matching losses. It derives the Information-Minimum Denoising Score Entropy (I-MDSE) and Information-Minimum Denoising Cross-Entropy (I-MDCE) relations, showing that common losses like DSE and DCE are tight estimators of log-likelihood, not just variational bounds.
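To make the time-integral decomposition concrete, here is a minimal Monte Carlo sketch for the masked-diffusion case. It assumes a linear masking schedule alpha_t = 1 - t, under which the standard continuous-time bound weights the per-token cross-entropy by 1/t; the paper's tightness result is what licenses reading the bound as a likelihood estimate when the denoiser is optimal. The `model` interface (per-token logits over the vocabulary) and `mask_id` token are hypothetical, and this follows the generic masked-diffusion bound rather than the paper's exact estimator.

```python
import torch

def mc_log_likelihood(model, x, mask_id, n_samples=128):
    """Monte Carlo sketch of a time-integral log-likelihood estimate for a
    masked (absorbing-state) discrete diffusion model.

    Assumes a linear schedule alpha_t = 1 - t, giving the bound
        -log p(x) <= E_{t~U(0,1)} [ (1/t) * sum_{masked i} -log p(x_i | x_t) ].
    `model` is a hypothetical denoiser: (1, L) token ids -> (1, L, vocab) logits.
    """
    L = x.shape[0]
    nll = torch.zeros(())
    for _ in range(n_samples):
        t = torch.rand(()).clamp_min(1e-4)       # t ~ Uniform(0, 1)
        mask = torch.rand(L) < t                 # each token masked w.p. t
        x_t = torch.where(mask, torch.full_like(x, mask_id), x)
        logits = model(x_t.unsqueeze(0)).squeeze(0)        # (L, vocab)
        log_probs = torch.log_softmax(logits, dim=-1)
        ce = -log_probs[torch.arange(L), x][mask].sum()    # CE on masked tokens
        nll = nll + ce / t / n_samples           # 1/t weight (linear schedule)
    return -nll                                  # estimate of log p(x)
```

Averaging over uniformly drawn t realizes the time integral; the 1/t weight makes variance near t ≈ 0 the obvious practical concern, which is why the variance stability reported in the paper's experiments matters.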

Business Value

Enables more reliable and interpretable generative models for discrete data, potentially improving applications in areas like natural language generation and molecular design.