Research Paper (arxiv_ml). Audience: AI Researchers, Generative Model Developers, Scientists in simulation-heavy fields

Contrastive Diffusion Alignment: Learning Structured Latents for Controllable Generation

Abstract

Diffusion models excel at generation, but their latent spaces are not explicitly organized for interpretable control. We introduce ConDA (Contrastive Diffusion Alignment), a framework that applies contrastive learning within diffusion embeddings to align latent geometry with system dynamics. Motivated by recent advances showing that contrastive objectives can recover more disentangled and structured representations, ConDA organizes diffusion latents such that traversal directions reflect underlying dynamical factors. Within this contrastively structured space, ConDA enables nonlinear trajectory traversal that supports faithful interpolation, extrapolation, and controllable generation. Across benchmarks in fluid dynamics, neural calcium imaging, therapeutic neurostimulation, and facial expression, ConDA produces interpretable latent representations with improved controllability compared to linear traversals and conditioning-based baselines. These results suggest that diffusion latents encode dynamics-relevant structure, but exploiting this structure requires latent organization and traversal along the latent manifold.
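
The abstract does not spell out the contrastive objective, so the following is only a minimal sketch of one common choice, the InfoNCE loss, applied to toy latent vectors. The function names, the temperature value, and the rule for choosing positives (latents from nearby dynamical states) and negatives (latents from distant states) are assumptions made for illustration, not details taken from the paper.

```python
import math

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce_loss(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss for a single anchor latent.

    Assumption: `positive` comes from a nearby dynamical state and
    `negatives` from distant states. This pairing rule is hypothetical.
    """
    logits = [cosine_similarity(anchor, positive) / temperature]
    logits += [cosine_similarity(anchor, n) / temperature for n in negatives]
    # Log-sum-exp with max subtraction for numerical stability.
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(x - m) for x in logits))
    # Negative log-probability assigned to the positive pair.
    return log_denominator - logits[0]

# Toy 2-D latents (hypothetical values).
anchor = [1.0, 0.0]
near = [0.9, 0.1]
far = [[-1.0, 0.0], [0.0, -1.0]]

loss_aligned = info_nce_loss(anchor, near, far)
loss_misaligned = info_nce_loss(anchor, far[0], [near, far[1]])
```

In this toy setup the loss is small when the positive lies close to the anchor and large when a distant latent is labeled as the positive, which is the pressure that pulls related states together in the latent space.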
Authors (12)
Ruchi Sandilya
Sumaira Perez
Charles Lynch
Lindsay Victoria
Benjamin Zebley
Derrick Matthew Buchanan
+6 more
Submitted
October 16, 2025
arXiv Category
cs.LG

Key Contributions

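The paper introduces ConDA, a framework that applies contrastive learning to diffusion model latents so that their geometry aligns with system dynamics, enabling structured and controllable generation. It reports more interpretable latents and better controllability than linear traversals and conditioning-based baselines, using nonlinear trajectory traversal in the latent space on benchmarks in fluid dynamics, neural calcium imaging, therapeutic neurostimulation, and facial expression.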
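
To illustrate why traversal along a curved manifold can differ from a linear traversal, here is a small sketch that compares straight-line interpolation with spherical interpolation (slerp) between two latents. Slerp and the unit-circle "manifold" are stand-ins chosen for this example; the listing does not describe ConDA's actual traversal method, and the helper names are hypothetical.

```python
import math

def lerp(a, b, t):
    """Straight-line interpolation between vectors a and b."""
    return [(1 - t) * x + t * y for x, y in zip(a, b)]

def slerp(a, b, t):
    """Spherical interpolation between unit vectors a and b.

    Assumes a and b are unit-norm and not antipodal.
    """
    dot = max(-1.0, min(1.0, sum(x * y for x, y in zip(a, b))))
    omega = math.acos(dot)  # angle between the two vectors
    if omega < 1e-8:
        # Nearly identical directions: fall back to linear interpolation.
        return lerp(a, b, t)
    s = math.sin(omega)
    weight_a = math.sin((1 - t) * omega) / s
    weight_b = math.sin(t * omega) / s
    return [weight_a * x + weight_b * y for x, y in zip(a, b)]

def norm(v):
    """Euclidean norm of a vector."""
    return math.sqrt(sum(x * x for x in v))

# Two toy latents on the unit circle (hypothetical values).
start = [1.0, 0.0]
end = [0.0, 1.0]

mid_linear = lerp(start, end, 0.5)
mid_curved = slerp(start, end, 0.5)

print("linear midpoint norm:", norm(mid_linear))
print("curved midpoint norm:", norm(mid_curved))
```

The linear midpoint falls inside the circle, away from where the toy latents live, while the curved midpoint stays on it. This is one simple way to picture the abstract's claim that exploiting latent structure requires traversal along the latent manifold.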

Business Value

ConDA enables more precise and interpretable control over generative models, which is useful for applications that require specific outputs, such as synthetic data generation for scientific research or personalized content creation.