Abstract
Recent text-to-image models have revolutionized image generation, yet they still struggle to maintain concept consistency across generated images. Existing works focus on character consistency but often overlook the crucial role of scenes in storytelling, which restricts their creativity in practice. This paper introduces scene-oriented story generation, addressing two key challenges: (i) scene planning, where current methods, relying solely on text descriptions, fail to ensure scene-level narrative coherence, and (ii) scene consistency, i.e., maintaining consistent scenes across multiple stories, which remains largely unexplored. We propose SceneDecorator, a training-free framework that employs VLM-Guided Scene Planning to ensure narrative coherence across different scenes in a "global-to-local" manner, and Long-Term Scene-Sharing Attention to maintain long-term scene consistency and subject diversity across generated stories. Extensive experiments demonstrate the superior performance of SceneDecorator and highlight its potential to unleash creativity in the fields of arts, films, and games.
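The "global-to-local" planning described above can be pictured as two chained prompts: a global pass outlines the whole story as a fixed set of scenes, then a local pass expands each scene into a concrete text-to-image prompt. The sketch below assumes a generic chat-style VLM endpoint; `vlm_chat`, `plan_scenes`, and the prompt wording are illustrative stand-ins, not the paper's actual implementation.

```python
# A minimal sketch of global-to-local scene planning, assuming access to
# any chat-style VLM/LLM endpoint. `vlm_chat` is a hypothetical stand-in;
# the prompts and model used by SceneDecorator are not given in the abstract.
from typing import Callable

def plan_scenes(story: str, num_scenes: int,
                vlm_chat: Callable[[str], str]) -> list[str]:
    # Global step: outline the whole story as a fixed number of scenes,
    # so scene-level narrative coherence is decided before any image prompt.
    outline = vlm_chat(
        f"Split this story into {num_scenes} scenes, one per line, "
        f"each naming its setting:\n{story}"
    )
    scenes = [line.strip() for line in outline.splitlines() if line.strip()]
    # Local step: expand every scene into a concrete text-to-image prompt
    # that restates the shared setting, keeping prompts mutually consistent.
    return [
        vlm_chat(
            "Write one text-to-image prompt for this scene, describing the "
            f"setting, characters, and action in detail:\n{scene}"
        )
        for scene in scenes
    ]

if __name__ == "__main__":
    # Toy stand-in that echoes its prompt, just to exercise the control flow.
    prompts = plan_scenes("A fox and a crow in an old forest.", 3,
                          vlm_chat=lambda p: p)
    print(len(prompts))
```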
Authors (8)
Quanjian Song
Donghao Zhou
Jingyu Lin
Fei Shen
Jiaze Wang
Xiaowei Hu
+2 more
Submitted
October 27, 2025
Key Contributions
SceneDecorator introduces scene-oriented story generation, addressing the scene-planning and scene-consistency challenges overlooked by prior character-focused methods. It employs VLM-Guided Scene Planning for narrative coherence and Long-Term Scene-Sharing Attention for consistent scenes and diverse subjects across stories, enabling more creative and coherent visual storytelling.
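One plausible reading of the Long-Term Scene-Sharing Attention mentioned above is self-attention whose keys and values are augmented with cached scene features, so every generated image attends to the same scene while its queries (and hence subjects) remain free to vary. The sketch below is a minimal PyTorch illustration of that idea under those assumptions; `scene_k`, `scene_v`, and `share_scale` are hypothetical names, not the paper's actual formulation.

```python
# A minimal sketch of scene-sharing attention, assuming a standard
# scaled-dot-product self-attention layer in a diffusion U-Net.
import torch
import torch.nn.functional as F

def scene_sharing_attention(q, k, v, scene_k, scene_v, share_scale=1.0):
    """Self-attention whose keys/values are augmented with cached
    scene features, so each generated image attends to the same scene.

    q, k, v:          (batch, tokens, dim) features of the current image
    scene_k, scene_v: (batch, scene_tokens, dim) cached scene features
    share_scale:      weight on the shared scene tokens (hypothetical knob)
    """
    # Concatenate the shared scene tokens into this image's key/value set.
    k_all = torch.cat([k, share_scale * scene_k], dim=1)
    v_all = torch.cat([v, scene_v], dim=1)
    # Queries still come only from the current image, which preserves
    # per-image subject diversity while anchoring the background scene.
    return F.scaled_dot_product_attention(q, k_all, v_all)

# Toy usage: 2 images of 64 tokens each share a 64-token scene cache.
q = torch.randn(2, 64, 128)
k = torch.randn(2, 64, 128)
v = torch.randn(2, 64, 128)
scene_k = torch.randn(2, 64, 128)
scene_v = torch.randn(2, 64, 128)
out = scene_sharing_attention(q, k, v, scene_k, scene_v)
print(out.shape)  # torch.Size([2, 64, 128])
```

Keeping queries local while sharing keys/values is what lets one cached scene anchor arbitrarily many stories, which matches the "long-term" framing in the abstract.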
Business Value
Empowers creators to generate more coherent and visually consistent stories, streamlining the process of creating narrative content for various media platforms.