Research Paper. Audience: AI researchers, computer vision engineers, digital artists, content creators

SplitFlow: Flow Decomposition for Inversion-Free Text-to-Image Editing

Abstract

Rectified flow models have become a de facto standard in image generation due to their stable sampling trajectories and high-fidelity outputs. Despite their strong generative capabilities, they face critical limitations in image editing tasks: inaccurate inversion processes for mapping real images back into the latent space, and gradient entanglement issues during editing often result in outputs that do not faithfully reflect the target prompt. Recent efforts have attempted to directly map source and target distributions via ODE-based approaches without inversion; however, these methods still yield suboptimal editing quality. In this work, we propose a flow decomposition-and-aggregation framework built upon an inversion-free formulation to address these limitations. Specifically, we semantically decompose the target prompt into multiple sub-prompts, compute an independent flow for each, and aggregate them to form a unified editing trajectory. While we empirically observe that decomposing the original flow enhances diversity in the target space, generating semantically aligned outputs still requires consistent guidance toward the full target prompt. To this end, we design a projection and soft-aggregation mechanism for flow, inspired by gradient conflict resolution in multi-task learning. This approach adaptively weights the sub-target velocity fields, suppressing semantic redundancy while emphasizing distinct directions, thereby preserving both diversity and consistency in the final edited output. Experimental results demonstrate that our method outperforms existing zero-shot editing approaches in terms of semantic fidelity and attribute disentanglement. The code is available at https://github.com/Harvard-AI-and-Robotics-Lab/SplitFlow.
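
The projection and soft-aggregation idea described in the abstract can be illustrated with a small, hedged sketch. It is not the authors' implementation: the function names, the PCGrad-style projection rule (borrowed from multi-task gradient conflict resolution), and the softmax weighting by cosine similarity to the full-prompt velocity are all assumptions chosen for illustration, and velocities are plain Python lists rather than latent tensors.

```python
import math


def dot(a, b):
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def project_conflicts(velocities):
    """Remove, from each velocity, the component that opposes another one.

    Assumed PCGrad-style rule: if v_i . v_j < 0, subtract the projection
    of v_i onto v_j. Hypothetical stand-in for the paper's projection step.
    """
    projected = []
    for i, v in enumerate(velocities):
        p = list(v)
        for j, u in enumerate(velocities):
            if i == j:
                continue
            d, uu = dot(p, u), dot(u, u)
            if d < 0 and uu > 0:
                p = [pi - (d / uu) * ui for pi, ui in zip(p, u)]
        projected.append(p)
    return projected


def soft_aggregate(velocities, reference, temperature=1.0):
    """Blend velocities with softmax weights.

    Assumed scoring: cosine similarity of each velocity to `reference`
    (standing in for the full-target-prompt velocity).
    """
    ref_norm = math.sqrt(dot(reference, reference))
    scores = []
    for v in velocities:
        norm = math.sqrt(dot(v, v)) * ref_norm
        scores.append(dot(v, reference) / norm if norm > 0 else 0.0)
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp((s - top) / temperature) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    blended = [
        sum(w * v[k] for w, v in zip(weights, velocities))
        for k in range(len(reference))
    ]
    return weights, blended


# Toy 2-D example: two sub-prompt velocities that partially conflict.
sub_velocities = [[1.0, 0.0], [-1.0, 1.0]]
full_velocity = [0.0, 1.0]  # hypothetical full-prompt velocity

projected = project_conflicts(sub_velocities)
weights, edit_velocity = soft_aggregate(projected, full_velocity)
```

In this toy setup the projection removes the opposing components, so the projected velocities no longer point against each other, and the softmax gives more weight to the sub-velocity that is better aligned with the full-prompt direction. A real pipeline would apply the same operations to the model's velocity predictions at each ODE step.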

Authors (6)

Sung-Hoon Yoon

Minghan Li

Gaspard Beaudouin

Congcong Wen

Muhammad Rafay Azhar

Mengyu Wang

Submitted

October 29, 2025

arXiv Category

cs.CV

Key Contributions

Proposes SplitFlow, a flow decomposition-and-aggregation framework for inversion-free text-to-image editing. This method semantically decomposes prompts, computes independent flows, and aggregates them to achieve faithful editing without problematic inversion or gradient entanglement issues.

Business Value

Enables more precise and intuitive control over image generation and editing, empowering artists and designers with powerful tools for creative expression and content creation.

Paper Metadata

Innovation Type

Novel Framework for Image Editing

Deployment Feasibility

Requires significant computational resources for training and inference, typical for large generative models.

Limitations Addressed

Inaccurate inversion processes in rectified flow models, gradient entanglement issues during editing, suboptimal editing quality in existing inversion-free methods

Performance Gains

Reported to outperform existing zero-shot editing approaches in semantic fidelity and attribute disentanglement.

Technical Tags

text-to-image editing, rectified flow models, inversion-free, flow decomposition, prompt aggregation, semantic editing, latent space manipulation, gradient entanglement, generative models, image generation

Research Topics

Generative Models, Image Synthesis, Computer Vision, Natural Language Processing, Machine Learning, AI Ethics

Methods & Architectures

Flow decomposition, flow aggregation, inversion-free formulation, semantic prompt decomposition, rectified flow models, flow-based generative models

Applications & Tasks

Digital art, content creation, image editing, creative AI, text-to-image generation, inversion accuracy, editing fidelity, editing existing images based on text prompts, high-fidelity text-to-image manipulation

Related Fields

Generative AI, Computer Vision, Natural Language Processing, Deep Learning

Keywords

text-to-image, image editing, rectified flow, generative models, inversion-free, flow decomposition, prompt engineering, semantic editing, deep learning, computer vision

Academic Context

Generative Models, Image Synthesis, Computer Vision, Natural Language Processing, Machine Learning, AI Ethics

Commercial Potential

Potential Products

Advanced image editing software, AI-powered creative tools, personalized content generation platforms

Target Industries

Media and Entertainment, Advertising, Design, Gaming

Use Case Examples

Modifying specific objects or attributes in an image based on text instructions, generating variations of an image with precise textual control, creating realistic image edits that maintain overall scene coherence

Competitive Edge

Addresses key limitations of existing text-to-image editing methods by offering a more robust and faithful approach through flow decomposition.

Market Opportunity

Large and growing market for generative AI tools in creative industries.

Revenue Models

Software licensing, API access, subscription services.

Resource Requirements

Compute Needs

High computational resources for training and inference, typical for large generative models.

Data Requirements

Requires large-scale image-text datasets for training.

Deployment Constraints

Computational cost, model size

Production Readiness

Maturity Level

Research

Time to Market

Medium-term, for integration into creative software.
