Kernel Density Steering: Inference-Time Scaling via Mode Seeking for Image Restoration

Abstract

Diffusion models show promise for image restoration, but existing methods often struggle with inconsistent fidelity and undesirable artifacts. To address this, we introduce Kernel Density Steering (KDS), a novel inference-time framework promoting robust, high-fidelity outputs through explicit local mode-seeking. KDS employs an $N$-particle ensemble of diffusion samples, computing patch-wise kernel density estimation gradients from their collective outputs. These gradients steer patches in each particle towards shared, higher-density regions identified within the ensemble. This collective local mode-seeking mechanism, acting as "collective wisdom", steers samples away from spurious modes prone to artifacts, arising from independent sampling or model imperfections, and towards more robust, high-fidelity structures. This allows us to obtain better quality samples at the expense of higher compute by simultaneously sampling multiple particles. As a plug-and-play framework, KDS requires no retraining or external verifiers, seamlessly integrating with various diffusion samplers. Extensive numerical validations demonstrate KDS substantially improves both quantitative and qualitative performance on challenging real-world super-resolution and image inpainting tasks.
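
The abstract does not spell out the gradient it refers to. As a rough illustration only, the following shows the standard form such a gradient takes under a Gaussian kernel. The kernel choice, the bandwidth $h$, the step size $\delta$, and the symbols $x_i$ (one patch taken from particle $i$, with patch dimension $d$) are assumptions made for this sketch, not details taken from the paper.

$$
\hat{p}(x) = \frac{1}{N} \sum_{j=1}^{N} \frac{1}{(2\pi h^2)^{d/2}} \exp\!\left(-\frac{\lVert x - x_j \rVert^2}{2h^2}\right)
$$

$$
\nabla_x \log \hat{p}(x) = \frac{m(x) - x}{h^2},
\qquad
m(x) = \sum_{j=1}^{N} w_j(x)\, x_j,
\qquad
w_j(x) = \frac{\exp\!\left(-\lVert x - x_j \rVert^2 / 2h^2\right)}{\sum_{k=1}^{N} \exp\!\left(-\lVert x - x_k \rVert^2 / 2h^2\right)}
$$

Under these assumptions, the gradient points from a patch towards the kernel-weighted mean of the corresponding patches across the ensemble, so a steering step of the form $x_i \leftarrow x_i + \delta\,\bigl(m(x_i) - x_i\bigr)$ moves each patch towards a nearby higher-density region. This is the classical mean-shift update.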
Authors: Yuyang Hu, Kangfu Mei, Mojtaba Sahraee-Ardakan, Ulugbek S. Kamilov, Peyman Milanfar, Mauricio Delbracio
Submitted: July 8, 2025
arXiv Category: cs.CV

Key Contributions

Kernel Density Steering (KDS) is a novel inference-time framework for diffusion models that improves image restoration without retraining or external verifiers. It samples an $N$-particle ensemble and uses patch-wise kernel density estimation (KDE) gradients to steer the patches of each particle towards higher-density regions shared across the ensemble. This "collective wisdom" mechanism reduces artifacts and improves fidelity by steering samples away from spurious modes.
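
To make the steering idea concrete, here is a minimal NumPy sketch of a single patch-wise mode-seeking step over an ensemble. It is not the authors' implementation. The function name `kde_mean_shift_step`, the parameters `bandwidth` and `step`, the Gaussian kernel, and the `(N, P, D)` array layout (particles, patch locations, flattened patch dimension) are all assumptions made for illustration, and the sketch leaves out how such a step would be interleaved with a diffusion sampler.

```python
import numpy as np


def kde_mean_shift_step(patches, bandwidth=1.0, step=0.5):
    """One Gaussian-kernel mean-shift step per patch location (illustrative).

    patches:   array of shape (N, P, D), N particles, P patch locations,
               D values per flattened patch (hypothetical layout).
    bandwidth: kernel bandwidth h (assumed hyperparameter).
    step:      step size in [0, 1]; 0 leaves patches unchanged, 1 moves
               each patch fully onto its kernel-weighted ensemble mean.
    """
    x = np.swapaxes(np.asarray(patches, dtype=np.float64), 0, 1)  # (P, N, D)

    # Squared distances between particles i and j at each patch location.
    diff = x[:, :, None, :] - x[:, None, :, :]        # (P, N, N, D)
    sq_dist = np.sum(diff ** 2, axis=-1)              # (P, N, N)

    # Normalized Gaussian kernel weights (a numerically stable softmax).
    logits = -sq_dist / (2.0 * bandwidth ** 2)
    logits -= logits.max(axis=-1, keepdims=True)
    weights = np.exp(logits)
    weights /= weights.sum(axis=-1, keepdims=True)    # rows sum to 1

    # Kernel-weighted mean m(x_i) of the ensemble for every particle i.
    means = np.einsum("pij,pjd->pid", weights, x)     # (P, N, D)

    # Move each patch along the mean-shift vector m(x_i) - x_i.
    shifted = x + step * (means - x)
    return np.swapaxes(shifted, 0, 1)                 # back to (N, P, D)
```

Calling `kde_mean_shift_step(patches, bandwidth=1.0, step=0.5)` on an ensemble of noisy patches pulls the particles at each patch location closer together, and a smaller `bandwidth` makes each patch attend mostly to its nearest neighbours in the ensemble. The cost grows quadratically in the number of particles per patch location, which fits the abstract's remark that the quality gain comes at the expense of higher compute.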

Business Value

KDS enables higher-quality restored images at the cost of additional inference compute, which is valuable in fields such as digital archiving, medical imaging enhancement, and professional photography.