📄 Abstract
The generation of high-quality, diverse, and prompt-aligned images is a
central goal in image-generating diffusion models. The popular classifier-free
guidance (CFG) approach improves quality and alignment at the cost of reduced
variation, creating an inherent entanglement of these effects. Recent work has
successfully disentangled these properties by guiding a model with a separately
trained, inferior counterpart; however, this solution introduces the
considerable overhead of requiring an auxiliary model. We challenge this
prerequisite by introducing In-situ Autoguidance, a method that elicits
guidance from the model itself without any auxiliary components. Our approach
dynamically generates an inferior prediction on the fly using a stochastic
forward pass, reframing guidance as a form of inference-time self-correction.
We demonstrate that this zero-cost approach is not only viable but also
establishes a powerful new baseline for cost-efficient guidance, proving that
the benefits of self-guidance can be achieved without external models.
Submitted
October 20, 2025
Key Contributions
Introduces 'In-situ Autoguidance', a novel method for diffusion models that elicits guidance from the model itself via a stochastic forward pass, eliminating the need for auxiliary models. This achieves cost-efficient guidance, improving image quality and prompt alignment without sacrificing diversity.
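A minimal sketch of how such in-situ guidance could be implemented, assuming a PyTorch-style denoiser that contains dropout layers; the function name, arguments, and guidance weight below are illustrative assumptions, not the authors' implementation:

```python
import torch

def in_situ_autoguidance(model, x_t, sigma, cond, w=2.0):
    """Illustrative sketch: guide a denoiser with a stochastic ("inferior")
    version of itself, in the spirit of autoguidance, without a separately
    trained auxiliary model. All names and arguments are assumptions."""
    # "Strong" prediction: standard deterministic forward pass.
    model.eval()
    with torch.no_grad():
        d_strong = model(x_t, sigma, cond)

    # "Weak" prediction: a stochastic forward pass of the SAME model,
    # e.g. with dropout left active, producing an inferior estimate on the fly.
    model.train()  # enables dropout; assumes the network has dropout layers
    with torch.no_grad():
        d_weak = model(x_t, sigma, cond)
    model.eval()

    # Autoguidance-style extrapolation away from the inferior prediction.
    return d_weak + w * (d_strong - d_weak)
```

Under this reading, the only extra cost per sampling step is a second forward pass of the same network, with no auxiliary model to train, store, or load.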
Business Value
Enables faster and more cost-effective generation of high-quality images, benefiting creative industries, content generation platforms, and applications requiring customized visual assets.