Statistical Inference for Generative Model Comparison

Abstract

Generative models have achieved remarkable success across a range of applications, yet their evaluation still lacks principled uncertainty quantification. In this paper, we develop a method for comparing how close different generative models are to the underlying distribution of test samples. In particular, our approach employs the Kullback-Leibler (KL) divergence to measure the distance between a generative model and the unknown test distribution, as KL requires no tuning parameters such as the kernels used by RKHS-based distances, and is the only $f$-divergence that admits a crucial cancellation to enable the uncertainty quantification. Furthermore, we extend our method to comparing conditional generative models and leverage Edgeworth expansions to address limited-data settings. On simulated datasets with known ground truth, we show that our approach realizes effective coverage rates, and has higher power compared to kernel-based methods. When applied to generative models on image and text datasets, our procedure yields conclusions consistent with benchmark metrics but with statistical confidence.
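
A short worked identity may help make the "crucial cancellation" concrete. The notation is assumed here for illustration and is not taken from the paper: $P$ is the unknown test distribution with density $p$, and $Q_1$, $Q_2$ are two generative models with densities $q_1$, $q_2$.

$$
\mathrm{KL}(P \,\|\, Q_1) - \mathrm{KL}(P \,\|\, Q_2)
= \mathbb{E}_{P}\big[\log p(X) - \log q_1(X)\big] - \mathbb{E}_{P}\big[\log p(X) - \log q_2(X)\big]
= \mathbb{E}_{P}\big[\log q_2(X) - \log q_1(X)\big].
$$

The unknown $\log p$ term drops out, so the difference is an ordinary expectation under $P$ that can be estimated by averaging over test samples. A negative value means $Q_1$ is closer to $P$ in KL divergence. How the log-densities are obtained for models without a tractable likelihood is not described in the abstract.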
Authors: Zijun Gao, Yan Sun, Han Su
Submitted: January 31, 2025
arXiv category: stat.ML

Key Contributions

This paper develops a principled method for comparing generative models by their KL divergence to the unknown test distribution, with uncertainty quantification for the comparison. It extends the method to conditional generative models and uses Edgeworth expansions for limited-data settings. On simulated data with known ground truth, the method attains effective coverage rates and higher power than kernel-based methods.
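
As a rough illustration of what uncertainty quantification for a KL comparison can look like, the sketch below estimates the difference KL(P‖Q1) − KL(P‖Q2) as the sample mean of log q2(x) − log q1(x) over test samples, since the unknown test density cancels in the difference, and attaches a normal-approximation confidence interval. This is a simplified stand-in, not the authors' procedure: it assumes both models expose tractable log-densities, it does not include the Edgeworth correction, and the names `kl_difference_ci` and `gaussian_logpdf` as well as the toy Gaussian models are hypothetical.

```python
import math
import numpy as np


def kl_difference_ci(logq1, logq2, z=1.96):
    """Estimate KL(P||Q1) - KL(P||Q2) from per-sample model log-densities.

    logq1, logq2: log q1(x_i) and log q2(x_i) evaluated on test samples x_i ~ P.
    Returns (estimate, lower, upper) using a normal approximation.
    A negative estimate suggests Q1 is closer to P than Q2.
    """
    d = np.asarray(logq2, dtype=float) - np.asarray(logq1, dtype=float)
    est = d.mean()
    se = d.std(ddof=1) / math.sqrt(d.size)  # standard error of the mean
    return est, est - z * se, est + z * se


def gaussian_logpdf(x, mean, std):
    """Log-density of N(mean, std^2); used only to build a toy example."""
    return -0.5 * np.log(2 * np.pi * std**2) - (x - mean) ** 2 / (2 * std**2)


# Toy setting (assumed): P = N(0, 1), Q1 = N(0.1, 1), Q2 = N(1, 1).
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=5000)
logq1 = gaussian_logpdf(x, 0.1, 1.0)
logq2 = gaussian_logpdf(x, 1.0, 1.0)

est, lo, hi = kl_difference_ci(logq1, logq2)
print(f"estimate={est:.3f}, 95% CI=({lo:.3f}, {hi:.3f})")
```

In this toy setting the true difference is 0.005 − 0.5 = −0.495, so an interval that lies entirely below zero supports the conclusion that Q1 is closer to the test distribution than Q2.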

Business Value

The method offers a more reliable, statistically grounded way to evaluate and compare generative models, which matters both for selecting the best model for an application and for understanding the models' limitations.