Abstract
Masked diffusion models have recently emerged as a flexible framework for
discrete generative modeling. However, a key limitation of standard masked
diffusion is its inability to capture dependencies among tokens that are
predicted concurrently, which degrades generation quality when such
dependencies matter. To model these dependencies explicitly, we propose
Variational Masked Diffusion (VMD), a framework that
introduces latent variables into the masked diffusion process. Through
controlled experiments on synthetic datasets, we demonstrate that VMD
successfully learns dependencies that conventional masked diffusion fails to
capture. We further validate the effectiveness of our approach on Sudoku
puzzles and text datasets, where modeling inter-token dependencies improves
global consistency. Across these domains, VMD enhances both generation quality
and dependency awareness, highlighting the value of integrating variational
inference into masked diffusion. Our code is available at:
https://riccizz.github.io/VMD.
Authors (3)
Yichi Zhang
Alex Schwing
Zhizhen Zhao
Submitted
October 27, 2025
Key Contributions
Introduces Variational Masked Diffusion (VMD), a framework that enhances masked diffusion models by incorporating latent variables. This allows VMD to explicitly model dependencies among concurrently predicted tokens, leading to improved generation quality and global consistency, particularly in domains like text and combinatorial puzzles.
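The core idea can be illustrated with a toy sketch. Suppose two masked tokens are perfectly correlated (both 0 or both 1). A standard masked diffusion model unmasks them concurrently from independent marginals, so the correlation is lost; a VMD-style decoder that first samples a shared latent variable and then decodes both tokens conditioned on it can recover the dependency. The snippet below is a minimal, hypothetical illustration of this principle, not the paper's actual architecture:

```python
import random

random.seed(0)

def factorized_sample():
    # Standard masked diffusion: tokens unmasked in parallel are drawn
    # from independent per-token marginals (each 0.5/0.5 here), so the
    # joint structure "both tokens equal" cannot be represented.
    return (random.random() < 0.5, random.random() < 0.5)

def latent_sample():
    # VMD-style sketch (hypothetical): draw a shared latent z first,
    # then decode both tokens conditioned on z. Conditioning on the
    # latent restores the dependency (here, perfect correlation).
    z = random.random() < 0.5
    return (z, z)

n = 10_000
fact_agree = sum(a == b for a, b in (factorized_sample() for _ in range(n))) / n
lat_agree = sum(a == b for a, b in (latent_sample() for _ in range(n))) / n
print(f"factorized agreement:         {fact_agree:.2f}")  # ~0.5
print(f"latent-conditioned agreement: {lat_agree:.2f}")   # 1.00
```

In a real model the latent would be inferred variationally during training and the conditional decoder would be a neural network; the toy version only shows why a shared latent lets concurrently predicted tokens remain consistent.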
Business Value
Enables the generation of more coherent and contextually relevant discrete data, useful for applications like creative writing, code generation, and solving complex combinatorial problems.