Abstract

In image generation, Multiple Latent Variable Generative Models (MLVGMs)
employ multiple latent variables to gradually shape the final images, from
global characteristics to finer and local details (e.g., StyleGAN, NVAE),
emerging as powerful tools for diverse applications. Yet their generative
dynamics remain only empirically observed, without a systematic understanding
of each latent variable's impact.
In this work, we propose a novel framework that quantifies the contribution
of each latent variable using Mutual Information (MI) as a metric. Our analysis
reveals that current MLVGMs often underutilize some latent variables, and
provides actionable insights for their use in downstream applications. With
this foundation, we introduce a method for generating synthetic data for
Self-Supervised Contrastive Representation Learning (SSCRL). By leveraging the
hierarchical and disentangled variables of MLVGMs, our approach produces
diverse and semantically meaningful views without the need for real image data.
Additionally, we introduce a Continuous Sampling (CS) strategy, where the
generator dynamically creates new samples during SSCRL training, greatly
increasing data variability. Our comprehensive experiments demonstrate the
effectiveness of these contributions, showing that views generated by MLVGMs
are on par with, or even surpass, views generated from real data.
This work establishes a principled approach to understanding and exploiting
MLVGMs, advancing both generative modeling and self-supervised learning. Code
and pre-trained models are available at: https://github.com/SerezD/mi_ml_gen.
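The Continuous Sampling idea described above can be illustrated with a minimal sketch. Everything below is a hypothetical stand-in, not the paper's implementation: the generator is a toy linear map from a list of latent variables (coarse to fine) to a feature vector, and `sample_views` builds two contrastive views that share the coarse latent while re-sampling the finer ones, so the views agree on global characteristics but differ in local detail.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an MLVGM: one projection matrix per latent level
# (coarse -> fine). A real model would be e.g. StyleGAN or NVAE.
W = [rng.normal(size=(8, 16)) for _ in range(3)]

def generate(latents):
    # Sum the contribution of each latent level, mimicking the
    # coarse-to-fine shaping of the final sample.
    return sum(w.T @ z for w, z in zip(W, latents))

def sample_views(sigma=0.5):
    """Continuous Sampling sketch: draw a fresh coarse latent, then
    create two views that share it but re-sample the finer latents,
    yielding semantically consistent yet distinct positives."""
    coarse = rng.normal(size=8)
    views = []
    for _ in range(2):
        latents = [coarse] + [sigma * rng.normal(size=8) for _ in range(2)]
        views.append(generate(latents))
    return views

v1, v2 = sample_views()
```

Because new latents are drawn at every training step, the pool of positive pairs is effectively unbounded, which is the source of the increased data variability the abstract refers to.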