
CANDI: Hybrid Discrete-Continuous Diffusion Models


📄 Abstract

While continuous diffusion has shown remarkable success in continuous domains such as image generation, its direct application to discrete data has underperformed compared to purely discrete formulations. This gap is counterintuitive, given that continuous diffusion learns score functions that enable joint evolution across multiple positions. To understand this gap, we introduce token identifiability as an analytical framework for understanding how Gaussian noise corrupts discrete data through two mechanisms: discrete identity corruption and continuous rank degradation. We reveal that these mechanisms scale differently with vocabulary size, creating a temporal dissonance: at noise levels where discrete corruption preserves enough structure for conditional learning, continuous denoising is trivial; at noise levels where continuous denoising is meaningful, discrete corruption destroys nearly all conditional structure. To solve this, we propose CANDI (Continuous ANd DIscrete diffusion), a hybrid framework that decouples discrete and continuous corruption, enabling simultaneous learning of both conditional structure and continuous geometry. We empirically validate the temporal dissonance phenomenon and demonstrate that CANDI successfully avoids it. This unlocks the benefits of continuous diffusion for discrete spaces: on controlled generation, CANDI enables classifier-based guidance with off-the-shelf classifiers through simple gradient addition; on text generation, CANDI outperforms masked diffusion at low NFE, demonstrating the value of learning continuous gradients for discrete spaces. We include the code on the project page available here: https://patrickpynadath1.github.io/candi-lander
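The two corruption mechanisms named in the abstract can be illustrated with a small simulation. The sketch below is not the paper's definition of token identifiability. It assumes one-hot token encodings with isotropic Gaussian noise, and the function name `identifiability_stats` and both metrics are hypothetical choices made for illustration. It reports how often the noisy vector's argmax still recovers the true token (a proxy for discrete identity) and the average rank of the true token among all coordinates (a proxy for continuous rank).

```python
import numpy as np

def identifiability_stats(vocab_size, sigma, n_trials=2000, seed=0):
    """Corrupt one-hot tokens with Gaussian noise and return two numbers:
    identity_kept: fraction of trials where argmax is still the true token
    mean_rank:     average rank of the true token (0 means it is the largest)
    Both metrics are illustrative assumptions, not the paper's definitions.
    """
    rng = np.random.default_rng(seed)
    # By symmetry we can fix the true token at index 0.
    noisy = sigma * rng.standard_normal((n_trials, vocab_size))
    noisy[:, 0] += 1.0  # the one-hot signal sits on the true token
    true_val = noisy[:, 0:1]
    # Rank = how many other coordinates ended up above the true coordinate.
    rank = (noisy[:, 1:] > true_val).sum(axis=1)
    return float((rank == 0).mean()), float(rank.mean())

if __name__ == "__main__":
    for v in (10, 1000):
        for sigma in (0.1, 0.3, 1.0):
            kept, rank = identifiability_stats(v, sigma)
            print(f"V={v:5d} sigma={sigma:.1f} "
                  f"identity_kept={kept:.3f} mean_rank/V={rank / v:.3f}")
```

Under these assumptions, the probability that the argmax survives depends on the maximum of all the other noisy coordinates, so it falls as the vocabulary grows at a fixed noise level. The mean rank divided by vocabulary size depends only on pairwise comparisons, so it should stay roughly constant across vocabulary sizes. This is one simple way to see how the two quantities can scale differently with vocabulary size.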

Authors (3)

Patrick Pynadath

Jiaxin Shi

Ruqi Zhang

Submitted

October 26, 2025

arXiv Category

cs.LG

Key Contributions

CANDI is a hybrid discrete-continuous diffusion framework that decouples discrete and continuous corruption, so a model can learn conditional structure and continuous geometry at the same time. The paper's token identifiability analysis explains why applying Gaussian noise directly to discrete data underperforms. On text generation, CANDI outperforms masked diffusion at low NFE.

Business Value

May enable more capable generative applications for text, code, and other discrete data types, with possible uses in content creation and drug discovery.

Paper Metadata

Innovation Type

Algorithmic Innovation

Deployment Feasibility

Moderate. Diffusion models can be computationally intensive to train and sample from. The abstract reports that CANDI outperforms masked diffusion at low NFE, which suggests possible savings at inference time.

Limitations Addressed

Underperformance of continuous diffusion on discrete data, Gap between discrete identity corruption and continuous rank degradation, Temporal dissonance in noise levels for discrete vs. continuous denoising, Difficulty in applying continuous diffusion principles to discrete sequences

Performance Gains

Outperforms masked diffusion on text generation at low NFE, Enables classifier-based guidance with off-the-shelf classifiers for controlled generation, Brings the benefits of continuous diffusion to discrete tasks
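The abstract describes classifier-based guidance as "simple gradient addition". The sketch below shows that general idea only. It is not CANDI's sampler: the linear-softmax classifier, the zero vector standing in for the diffusion model's own update direction, and the names `classifier_grad` and `guided_direction` are all hypothetical stand-ins.

```python
import numpy as np

def log_softmax(z):
    """Numerically stable log-softmax for a 1-D logit vector."""
    z = z - z.max()
    return z - np.log(np.exp(z).sum())

def classifier_grad(x, W, b, target):
    """Gradient of log p(target | x) w.r.t. x for a toy linear-softmax
    classifier with logits W @ x + b (a stand-in for a real classifier)."""
    probs = np.exp(log_softmax(W @ x + b))
    return W[target] - probs @ W

def guided_direction(model_direction, x, W, b, target, scale=1.0):
    """Guidance by gradient addition: model direction plus the scaled
    classifier gradient."""
    return model_direction + scale * classifier_grad(x, W, b, target)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    dim, n_classes, target = 8, 3, 1
    W = rng.standard_normal((n_classes, dim))
    b = np.zeros(n_classes)
    x = rng.standard_normal(dim)
    base = np.zeros(dim)  # placeholder for the diffusion model's direction
    before = log_softmax(W @ x + b)[target]
    x_new = x + 0.01 * guided_direction(base, x, W, b, target)
    after = log_softmax(W @ x_new + b)[target]
    print(f"log p(target | x) before: {before:.4f}, after: {after:.4f}")
```

With a real classifier, the gradient would typically come from automatic differentiation instead of the closed form used here.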

Technical Tags

diffusion models, discrete data, continuous data, hybrid models, token identifiability, score functions, generative modeling, text generation, sequence modeling

Research Topics

Generative AI, Diffusion Models, Deep Learning, Sequence Modeling, Natural Language Processing, Computer Vision

Methods & Architectures

Hybrid discrete-continuous diffusion process, Token identifiability analysis, Score function learning, Noise corruption mechanisms (discrete identity, continuous rank), Diffusion Models, Hybrid architectures

Applications & Tasks

Natural Language Generation, Image Generation, Sequence Generation, Drug Discovery, Materials Science, Generating high-quality discrete data using continuous diffusion principles, Addressing the performance gap between continuous and discrete diffusion models, Modeling data with both discrete and continuous properties, Improving generative models for sequences, Text generation, Image synthesis, Generating structured data

Related Fields

Generative AI, Deep Learning, Machine Learning, Natural Language Processing, Computer Vision, Diffusion Models

Keywords

diffusion models, discrete data, continuous data, hybrid, generative AI, text generation, sequence modeling, token identifiability, score functions, deep learning

Academic Context

#Generative AI, #Diffusion Models, #Deep Learning, #Sequence Modeling, #Natural Language Processing, #Computer Vision

Technology Stack

Frameworks & Libraries

PyTorch

Programming Languages

Python

Commercial Potential

Potential Products

Advanced text generation models, Tools for generating molecular structures, Creative content generation platforms

Target Industries

Technology, Media and Entertainment, Pharmaceuticals, Materials Science, Gaming

Use Case Examples

Generating realistic dialogue for chatbots or virtual assistants. Designing novel drug molecules with desired properties. Creating synthetic datasets for training other ML models.

Competitive Edge

Offers a novel approach to generative modeling for discrete data by combining the strengths of continuous diffusion with discrete data handling, potentially surpassing existing methods.

Market Opportunity

Rapidly growing market for generative AI and diffusion models.

Revenue Models

Licensing of generative models, API access, Development of specialized generative tools

Resource Requirements

Compute Needs

High (GPU required for training)

Data Requirements

Large datasets of discrete data (e.g., text corpora, molecular structures) and potentially continuous data.

Deployment Constraints

Inference can be slow for diffusion models; requires careful tuning for specific discrete data types.

Scalability

Scalability depends on the complexity of the discrete data and the chosen diffusion model architecture.

Production Readiness

Maturity Level

Research

Time to Market

2-4 years for robust applications in specialized domains.

Patent Potential

Moderate, for the hybrid diffusion framework and analysis of noise corruption.
