
CADE 2.5 - ZeResFDG: Frequency-Decoupled, Rescaled and Zero-Projected Guidance for SD/SDXL Latent Diffusion Models

📄 Abstract

We introduce CADE 2.5 (Comfy Adaptive Detail Enhancer), a sampler-level guidance stack for SD/SDXL latent diffusion models. The central module, ZeResFDG, unifies (i) frequency-decoupled guidance that reweights low- and high-frequency components of the guidance signal, (ii) energy rescaling that matches the per-sample magnitude of the guided prediction to the positive branch, and (iii) zero-projection that removes the component parallel to the unconditional direction. A lightweight spectral EMA with hysteresis switches between a conservative and a detail-seeking mode as structure crystallizes during sampling. Across SD/SDXL samplers, ZeResFDG improves sharpness, prompt adherence, and artifact control at moderate guidance scales without any retraining. In addition, we employ a training-free inference-time stabilizer, QSilk Micrograin Stabilizer (quantile clamp + depth/edge-gated micro-detail injection), which improves robustness and yields natural high-frequency micro-texture at high resolutions with negligible overhead. For completeness we note that the same rule is compatible with alternative parameterizations (e.g., velocity), which we briefly discuss in the Appendix; however, this paper focuses on SD/SDXL latent diffusion models.
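To make the interplay of the three ZeResFDG components concrete, here is a minimal, hypothetical sketch of a sampler-level guidance hook that combines zero-projection, frequency-decoupled reweighting, and energy rescaling. All names (`zeresfdg_guidance`, `lowpass`, `w_low`, `w_high`, `rescale`) and the specific filter and normalization choices are illustrative assumptions, not the paper's actual implementation.

```python
import torch
import torch.nn.functional as F

def lowpass(x: torch.Tensor, kernel_size: int = 5) -> torch.Tensor:
    """Crude low-pass filter via average pooling; a stand-in for whatever
    frequency split the paper actually uses."""
    pad = kernel_size // 2
    x = F.pad(x, (pad, pad, pad, pad), mode="reflect")
    return F.avg_pool2d(x, kernel_size, stride=1)

def zeresfdg_guidance(eps_cond, eps_uncond, scale=7.5,
                      w_low=1.0, w_high=1.3, rescale=0.7):
    """eps_cond / eps_uncond: (B, C, H, W) noise predictions from the
    positive and unconditional branches of a CFG step."""
    delta = eps_cond - eps_uncond

    # (iii) zero-projection: remove the component of the guidance delta
    # that is parallel to the unconditional prediction.
    u = eps_uncond.flatten(1)
    d = delta.flatten(1)
    coeff = (d * u).sum(1, keepdim=True) / u.pow(2).sum(1, keepdim=True).clamp_min(1e-8)
    delta = (d - coeff * u).view_as(delta)

    # (i) frequency-decoupled guidance: reweight low- and high-frequency
    # parts of the (projected) guidance signal separately.
    low = lowpass(delta)
    high = delta - low
    delta = w_low * low + w_high * high

    # Standard CFG update, written around the positive branch.
    guided = eps_cond + (scale - 1.0) * delta

    # (ii) energy rescaling: match the per-sample magnitude of the guided
    # prediction to the positive branch (in the spirit of rescaled CFG).
    std_pos = eps_cond.flatten(1).std(1).view(-1, 1, 1, 1)
    std_g = guided.flatten(1).std(1).view(-1, 1, 1, 1).clamp_min(1e-8)
    return rescale * guided * (std_pos / std_g) + (1.0 - rescale) * guided
```

In an actual SD/SDXL pipeline such a hook would replace the plain CFG combination inside the sampler loop; the spectral EMA with hysteresis described above would then switch the low/high-frequency weights (or the overall scale) between the conservative and detail-seeking modes as sampling proceeds.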
Authors (1)
Denis Rychkovskiy
DZRobo, Independent Researcher
Institutions
🏛️ DZRobo, Independent Researcher
Submitted
October 14, 2025
arXiv Category
cs.CV

Key Contributions

Introduces ZeResFDG, a sampler-level guidance stack for SD/SDXL models that unifies frequency-decoupled guidance, energy rescaling, and zero-projection. The method improves sharpness, prompt adherence, and artifact control without any retraining, and the stack additionally provides a training-free inference-time stabilizer (QSilk) for robustness and natural high-frequency micro-texture at high resolutions.
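As an illustration of the stabilizer idea (quantile clamp plus gated micro-detail injection), the sketch below shows one plausible training-free formulation. The function name `qsilk_like_stabilizer`, the quantile level, the grain amplitude, and the use of an externally supplied depth/edge gate are assumptions made for exposition; the paper's QSilk module may differ in detail.

```python
import torch

def qsilk_like_stabilizer(latent, gate, q=0.999, grain=0.02, generator=None):
    """latent: (B, C, H, W) latent at the current sampling step.
    gate: (B, 1, H, W) in [0, 1], e.g. derived from a depth or edge map
    (1 = allow micro-detail injection, 0 = leave the region untouched)."""
    # Quantile clamp: bound outlier activations per sample, which helps
    # suppress blow-ups at high guidance scales and high resolutions.
    flat = latent.flatten(1)
    hi = torch.quantile(flat, q, dim=1).view(-1, 1, 1, 1)
    lo = torch.quantile(flat, 1.0 - q, dim=1).view(-1, 1, 1, 1)
    latent = torch.clamp(latent, min=lo, max=hi)

    # Gated micro-detail injection: add a small amount of noise only where
    # the gate permits, yielding high-frequency micro-texture.
    noise = torch.randn(latent.shape, generator=generator,
                        device=latent.device, dtype=latent.dtype)
    return latent + grain * gate * noise
```

Applied once per step (or only in the late, detail-seeking phase), this kind of clamp-and-inject pass adds negligible overhead since it involves no extra network evaluations.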

Business Value

Enhances the quality and controllability of AI-generated images, making diffusion models more practical for professional creative workflows, marketing, and personalized content generation.