RODS: Robust Optimization Inspired Diffusion Sampling for Detecting and Reducing Hallucination in Generative Models

Abstract

Diffusion models have achieved state-of-the-art performance in generative modeling, yet their sampling procedures remain vulnerable to hallucinations, often stemming from inaccuracies in score approximation. In this work, we reinterpret diffusion sampling through the lens of optimization and introduce RODS (Robust Optimization-inspired Diffusion Sampler), a novel method that detects and corrects high-risk sampling steps using geometric cues from the loss landscape. RODS enforces smoother sampling trajectories and adaptively adjusts perturbations, reducing hallucinations without retraining and at minimal additional inference cost. Experiments on AFHQv2, FFHQ, and 11k-hands demonstrate that RODS maintains comparable image quality and preserves generation diversity. More importantly, it improves both sampling fidelity and robustness, detecting over 70% of hallucinated samples and correcting more than 25%, all while avoiding the introduction of new artifacts. We release our code at https://github.com/Yiqi-Verna-Tian/RODS.

Authors (6)

Yiqi Tian

Pengfei Jin

Mingze Yuan

Na Li

Bo Zeng

Quanzheng Li

Submitted

July 16, 2025

arXiv Category

cs.CV

Key Contributions

Introduces RODS, a novel diffusion sampling method inspired by robust optimization, which detects and corrects high-risk sampling steps using geometric cues from the loss landscape. This method reduces hallucinations without retraining and at minimal inference cost, improving sampling fidelity and robustness.
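The general idea of flagging and refining risky sampling steps can be illustrated with a toy sketch. This is not the RODS algorithm from the paper: the two-mode score function, the direction-change test, the threshold `tau`, and the half-step refinement are all assumptions chosen for illustration. The sketch runs an Euler probability-flow sampler, measures how much the update direction changes across each step as a crude geometric cue, and re-integrates flagged samples with two half steps.

```python
import numpy as np

# Toy data distribution (assumption): two point masses in 2-D.
MODES = np.array([[-1.0, 0.0], [1.0, 0.0]])

def toy_score(x, sigma):
    """Exact score of the toy mixture after Gaussian noising at level sigma.

    Stands in for a learned score network, which is where real
    approximation errors would come from.
    """
    diff = MODES[None, :, :] - x[:, None, :]              # (n, k, d)
    logits = -np.sum(diff**2, axis=-1) / (2 * sigma**2)   # (n, k)
    logits -= logits.max(axis=1, keepdims=True)           # numerical stability
    w = np.exp(logits)
    w /= w.sum(axis=1, keepdims=True)
    return np.sum(w[:, :, None] * diff, axis=1) / sigma**2

def drift(x, sigma, score_fn):
    """Probability-flow ODE direction dx/dsigma = -sigma * score."""
    return -sigma * score_fn(x, sigma)

def guarded_sampler(score_fn, x, sigmas, tau=0.5):
    """Euler sampler that flags and refines steps with a large direction change.

    Returns the final samples and the indices of steps where at least one
    sample was flagged. `tau` is a hypothetical risk threshold.
    """
    flagged = []
    for i in range(len(sigmas) - 1):
        s, s_next = sigmas[i], sigmas[i + 1]
        d = drift(x, s, score_fn)
        x_trial = x + (s_next - s) * d
        # Cue: relative change of the update direction across the step.
        d_trial = drift(x_trial, s_next, score_fn)
        change = np.linalg.norm(d_trial - d, axis=1) / (np.linalg.norm(d, axis=1) + 1e-12)
        risky = change > tau
        if risky.any():
            flagged.append(i)
            # Correction (assumption): redo the step as two half steps.
            s_mid = 0.5 * (s + s_next)
            x_half = x + (s_mid - s) * d
            x_refined = x_half + (s_next - s_mid) * drift(x_half, s_mid, score_fn)
            x_trial = np.where(risky[:, None], x_refined, x_trial)
        x = x_trial
    return x, flagged

rng = np.random.default_rng(0)
sigmas = np.geomspace(5.0, 0.05, 20)                # decreasing noise schedule
x_init = sigmas[0] * rng.standard_normal((256, 2))
samples, flagged_steps = guarded_sampler(toy_score, x_init, sigmas)
print("flagged step indices:", flagged_steps)
```

In this toy setting the direction stays constant for samples that have already committed to one mode, so the cue only fires near the point where trajectories split between modes, which is also where in-between, hallucination-like samples would arise.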

Business Value

Enhances the reliability and quality of generated images, which is crucial for applications in creative industries, synthetic data generation, and content creation.

Paper Metadata

Innovation Type

Algorithmic Improvement

Deployment Feasibility

High, as it operates during inference with minimal additional cost and no retraining required.

Limitations Addressed

Vulnerability of diffusion model sampling procedures to hallucinations stemming from inaccuracies in score approximation.

Performance Gains

Detects over 70% of hallucinated samples and corrects more than 25% without introducing new artifacts.
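One plausible way to compute rates of this kind is sketched below. The definitions are assumptions, since the listing does not say how the paper defines them: detection rate is taken as the share of hallucinated samples that the sampler flagged, and correction rate as the share of hallucinated samples that are no longer hallucinated after correction. The labels in the example are made up and are not the paper's data.

```python
def hallucination_rates(hallucinated_before, flagged, hallucinated_after):
    """Compute detection and correction rates from per-sample boolean labels.

    All three arguments are equal-length sequences of booleans, one entry
    per generated sample. The rate definitions are assumptions, not taken
    from the paper.
    """
    n_bad = sum(1 for h in hallucinated_before if h)
    if n_bad == 0:
        raise ValueError("no hallucinated samples to measure against")
    detected = sum(1 for h, f in zip(hallucinated_before, flagged) if h and f)
    corrected = sum(
        1 for b, a in zip(hallucinated_before, hallucinated_after) if b and not a
    )
    # Samples that were clean before and hallucinated after the correction.
    introduced = sum(
        1 for b, a in zip(hallucinated_before, hallucinated_after) if not b and a
    )
    return {
        "detection_rate": detected / n_bad,
        "correction_rate": corrected / n_bad,
        "new_artifacts": introduced,
    }

# Made-up labels for eight samples, five of them hallucinated.
before = [True, True, True, True, True, False, False, False]
flags = [True, True, True, True, False, True, False, False]
after = [False, False, True, True, True, False, False, False]
print(hallucination_rates(before, flags, after))
# {'detection_rate': 0.8, 'correction_rate': 0.4, 'new_artifacts': 0}
```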

Technical Tags

diffusion models, sampling, optimization, score approximation, geometric cues, loss landscape, generative modeling, image generation, hallucination detection, robustness

Research Topics

Generative AI, Diffusion Models, Model Sampling, Robustness in AI, Image Generation Quality

Methods & Architectures

Robust Optimization-inspired Diffusion Sampler (RODS), geometric cues from loss landscape, adaptive perturbation adjustment
Diffusion Models

Applications & Tasks

Image Generation
Hallucination in generative models, Inaccurate score approximation, Vulnerable sampling procedures
Detecting and reducing hallucinations, Improving sampling fidelity and robustness

Datasets & Benchmarks

Datasets

AFHQv2, FFHQ, 11k-hands

Metrics

image quality, generation diversity, sampling fidelity, robustness, hallucination detection rate, hallucination correction rate

Related Fields

Machine Learning, Computer Vision, Deep Learning, Optimization

Keywords

diffusion models, generative models, hallucination, sampling, optimization, score approximation, image generation, robustness, geometric cues, loss landscape, AFHQv2, FFHQ, 11k-hands

Academic Context

#Generative AI, #Diffusion Models, #Model Sampling, #Robustness in AI, #Image Generation Quality

Commercial Potential

Potential Products

Improved image generation software, Tools for detecting AI-generated content

Target Industries

Media and Entertainment, Advertising, Gaming, Synthetic Data Generation

Use Case Examples

Generating high-fidelity images with reduced artifacts, Ensuring robustness in AI-generated content

Competitive Edge

Offers a post-hoc improvement to existing diffusion models, addressing hallucination issues without the need for retraining, which is a significant advantage over methods requiring model updates.

Market Opportunity

Growing market for generative AI tools and services.

Revenue Models

Licensing of the technology, integration into existing generative AI platforms.

Resource Requirements

Compute Needs

Minimal additional inference cost.

Data Requirements

Requires datasets for training/evaluation of diffusion models (e.g., AFHQv2, FFHQ, 11k-hands).

Scalability

Scales with the inference cost of the underlying diffusion model.

Production Readiness

Maturity Level

Research

Time to Market

1-2 years

Patent Potential

Moderate, for the novel RODS algorithm.
