📄 Abstract
Classifier-Free Guidance (CFG), which combines the conditional and
unconditional score functions with two coefficients summing to one, serves as a
practical technique for diffusion model sampling. Theoretically, however,
denoising with CFG cannot be expressed as a reciprocal diffusion
process, which may consequently leave some hidden risks during use. In this
work, we revisit the theory behind CFG and rigorously confirm that the improper
configuration of the combination coefficients (i.e., the widely used
summing-to-one version) brings about an expectation shift of the generative
distribution. To rectify this issue, we propose ReCFG with a relaxation on the
guidance coefficients such that denoising with ReCFG strictly aligns with the
diffusion theory. We further show that our approach enjoys a
closed-form solution given the guidance strength. That way,
the rectified coefficients can be readily pre-computed via traversing the
observed data, leaving the sampling speed barely affected. Empirical evidence
on real-world data demonstrates the compatibility of our post-hoc design with
existing state-of-the-art diffusion models, including both class-conditioned
ones (e.g., EDM2 on ImageNet) and text-conditioned ones
(e.g., SD3 on CC12M), without any retraining. Code is available at
https://github.com/thuxmf/recfg.
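
To make the coefficient constraint concrete, the following is a minimal sketch of how the two score estimates are combined. It is an illustration only: the function names (`combine_scores`, `standard_cfg`) and the toy numbers are hypothetical and are not taken from the paper or its repository. The relaxed coefficient pair used at the end is an arbitrary placeholder, not the paper's closed-form solution, which is not reproduced here.

```python
from typing import List, Sequence


def combine_scores(cond: Sequence[float], uncond: Sequence[float],
                   gamma_cond: float, gamma_uncond: float) -> List[float]:
    """Linearly combine conditional and unconditional score estimates.

    Hypothetical helper: the two coefficients are passed independently,
    so they are not forced to sum to one.
    """
    if len(cond) != len(uncond):
        raise ValueError("score vectors must have the same length")
    return [gamma_cond * c + gamma_uncond * u for c, u in zip(cond, uncond)]


def standard_cfg(cond: Sequence[float], uncond: Sequence[float],
                 gamma: float) -> List[float]:
    """Widely used CFG: coefficients gamma and (1 - gamma), summing to one."""
    return combine_scores(cond, uncond, gamma, 1.0 - gamma)


# Toy score values for a 2-dimensional sample (assumed, for illustration).
cond = [1.0, 2.0]
uncond = [0.5, 1.0]

# Summing-to-one version with guidance strength 3.0 (coefficients 3.0 and -2.0).
guided = standard_cfg(cond, uncond, gamma=3.0)

# Relaxed version: the unconditional coefficient is chosen separately.
# The value -1.5 is an arbitrary placeholder, NOT the rectified coefficient.
relaxed = combine_scores(cond, uncond, gamma_cond=3.0, gamma_uncond=-1.5)

print(guided)
print(relaxed)
```

In this framing, standard CFG is the special case of the relaxed combination in which the unconditional coefficient is tied to the conditional one. According to the abstract, ReCFG removes that tie and pre-computes the rectified coefficients from observed data.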
Key Contributions
Proposes Rectified Classifier-Free Guidance (ReCFG) for diffusion models, which corrects the theoretical misalignment of standard CFG by relaxing guidance coefficients. This ensures denoising strictly aligns with diffusion theory and offers a closed-form solution for coefficients.
Business Value
Leads to more theoretically sound and potentially more controllable conditional generation from diffusion models, improving the reliability and quality of generated content for various applications.