arxiv_cv | Research Paper | For: AI researchers in generative models, Computer graphics professionals, Digital artists, Developers of image generation tools

DyPE: Dynamic Position Extrapolation for Ultra High Resolution Diffusion

Abstract

Abstract: Diffusion Transformer models can generate images with remarkable fidelity and detail, yet training them at ultra-high resolutions remains extremely costly due to the self-attention mechanism's quadratic scaling with the number of image tokens. In this paper, we introduce Dynamic Position Extrapolation (DyPE), a novel, training-free method that enables pre-trained diffusion transformers to synthesize images at resolutions far beyond their training data, with no additional sampling cost. DyPE takes advantage of the spectral progression inherent to the diffusion process, where low-frequency structures converge early, while high-frequencies take more steps to resolve. Specifically, DyPE dynamically adjusts the model's positional encoding at each diffusion step, matching their frequency spectrum with the current stage of the generative process. This approach allows us to generate images at resolutions that exceed the training resolution dramatically, e.g., 16 million pixels using FLUX. On multiple benchmarks, DyPE consistently improves performance and achieves state-of-the-art fidelity in ultra-high-resolution image generation, with gains becoming even more pronounced at higher resolutions. Project page is available at https://noamissachar.github.io/DyPE/.
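To make the quadratic-scaling claim concrete, the sketch below counts image tokens and pairwise attention interactions at two resolutions. The 16-pixels-per-token figure (8x latent downsampling followed by 2x2 patching) and the 1024x1024 base resolution are assumptions chosen for illustration, not values stated in the abstract.

```python
def token_count(height_px: int, width_px: int, px_per_token: int = 16) -> int:
    """Number of image tokens, assuming each token covers a square pixel patch.

    px_per_token=16 is an assumed value (8x latent downsampling, 2x2 patches).
    """
    return (height_px // px_per_token) * (width_px // px_per_token)


def attention_pairs(n_tokens: int) -> int:
    """Self-attention compares every token with every token: n^2 pairs."""
    return n_tokens * n_tokens


# Assumed base resolution of 1024x1024 versus a 4096x4096 (about 16 million pixel) target.
base_tokens = token_count(1024, 1024)   # 64 * 64 = 4096
large_tokens = token_count(4096, 4096)  # 256 * 256 = 65536

# 16x more tokens leads to 256x more attention pairs.
token_ratio = large_tokens // base_tokens
pair_ratio = attention_pairs(large_tokens) // attention_pairs(base_tokens)
print(token_ratio, pair_ratio)  # 16 256
```

Under these assumptions, a 16x increase in pixel count multiplies the attention work per layer by 256, which is the cost that makes training directly at such resolutions expensive.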

Authors (6)

Noam Issachar

Guy Yariv

Sagie Benaim

Yossi Adi

Dani Lischinski

Raanan Fattal

Submitted

October 23, 2025

arXiv Category

cs.CV

Key Contributions

Introduces Dynamic Position Extrapolation (DyPE), a novel, training-free method that allows pre-trained diffusion transformers to generate images at ultra-high resolutions far beyond their training data, without additional sampling cost. DyPE dynamically adjusts positional encodings based on the diffusion process stage.
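As a rough illustration of what adjusting positional encodings per diffusion step could look like, the sketch below applies plain position interpolation to rotary-style frequencies with a scale factor that depends on the noise level. This is not the paper's formula: the schedule `step_scale`, its `power` parameter, and the choice to scale most strongly at early (noisy) steps are all hypothetical, picked only to show the general idea of a time-varying positional encoding.

```python
def rope_frequencies(dim: int, base: float = 10000.0) -> list[float]:
    """Standard rotary-embedding frequencies: base^(-2i/dim) for i < dim/2."""
    return [base ** (-2.0 * i / dim) for i in range(dim // 2)]


def step_scale(t: float, max_scale: float, power: float = 2.0) -> float:
    """Hypothetical schedule (an assumption, not from the paper).

    t is the noise level in [0, 1], with 1 meaning pure noise (first step).
    Returns max_scale at t=1 and decays to 1.0 (no scaling) at t=0.
    """
    return 1.0 + (max_scale - 1.0) * (t ** power)


def dynamic_rope_angles(position: int, t: float, dim: int, max_scale: float) -> list[float]:
    """Rotation angles for one position, using time-varying position interpolation.

    Dividing the position by the scale compresses large coordinates back
    toward the range seen in training; the amount varies with the step.
    """
    s = step_scale(t, max_scale)
    return [(position / s) * f for f in rope_frequencies(dim)]


# Example: target side length is 4x the training side length (max_scale=4).
early = dynamic_rope_angles(position=8, t=1.0, dim=4, max_scale=4.0)
late = dynamic_rope_angles(position=8, t=0.0, dim=4, max_scale=4.0)
print(early[0], late[0])  # 2.0 8.0
```

In this sketch the positions are fully compressed while coarse structure is forming and left untouched by the final step. Whether DyPE uses this direction, this functional form, or a different extrapolation scheme should be checked against the paper itself.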

Business Value

Democratizes the creation of ultra-high resolution imagery, enabling applications in fields requiring extreme detail, such as high-fidelity art, detailed scientific visualizations, and immersive virtual environments.

Paper Metadata

Innovation Type

Algorithmic

Deployment Feasibility

High. Being training-free and having no additional sampling cost makes it highly practical for inference with existing pre-trained models.

Limitations Addressed

Extreme computational cost of training diffusion models at ultra-high resolutions, Quadratic scaling of self-attention with image tokens, Inability of pre-trained models to generate images at resolutions higher than training data

Performance Gains

Enables generation at resolutions far beyond training data (e.g., 16 million pixels), No additional sampling cost

Technical Tags

Diffusion Models, Ultra High Resolution, Dynamic Position Extrapolation (DyPE), Diffusion Transformer, Training-Free, Positional Encoding, Frequency Spectrum, Image Generation, Scalability, Self-Attention

Research Topics

Generative Models, Image Synthesis, Deep Learning, High-Resolution Imaging, Model Efficiency

Methods & Architectures

Dynamic Position Extrapolation (DyPE), Training-free method, Adjusting positional encoding per diffusion step, Leveraging spectral progression, Diffusion Transformer

Applications & Tasks

Image Generation, Digital Art, Content Creation, High-Resolution Display, Scientific Visualization, Generating ultra-high resolution images, Overcoming quadratic scaling of self-attention, Enabling generation beyond training resolution, Synthesizing images at resolutions significantly higher than the model's training resolution, Reducing the cost of ultra-high resolution image generation

Related Fields

Computer Vision, Generative AI, Deep Learning, Image Processing, Machine Learning

Keywords

Diffusion Models, Ultra High Resolution, Image Generation, DyPE, Diffusion Transformer, Training-Free, Positional Encoding, Scalability, Generative AI, High-Fidelity, Frequency Spectrum, Self-Attention, Computational Efficiency

Academic Context

#Generative Models, #Image Synthesis, #Deep Learning, #High-Resolution Imaging, #Model Efficiency

Commercial Potential

Potential Products

High-resolution image generation services, Plugins for creative software, Tools for generating ultra-detailed textures and assets

Target Industries

Media and Entertainment, Gaming, Advertising, Architecture, Scientific Visualization, Digital Art

Use Case Examples

Generating photorealistic images at 8K or higher resolutions, Creating extremely detailed textures for 3D models, Visualizing complex scientific data at unprecedented detail

Competitive Edge

Offers a unique training-free method to significantly extend the resolution capabilities of existing diffusion transformers, overcoming a major bottleneck in high-resolution image generation.

Market Opportunity

Massive and growing market for high-quality image generation.

Revenue Models

Integration into existing generative AI platforms, API services

Resource Requirements

Compute Needs

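Low relative to retraining (inference-time only, training-free). Attention cost at inference still grows quadratically with the number of image tokens.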

Data Requirements

Requires pre-trained diffusion transformer models.

Deployment Constraints

Requires compatible pre-trained diffusion transformer models, Output file sizes can be very large

Scalability

Highly scalable in terms of output resolution due to its training-free nature.

Production Readiness

Maturity Level

Research

Time to Market

1-2 years

Patent Potential

Moderate (novel extrapolation technique)
