Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: Text-to-image diffusion models have demonstrated significant capabilities to
generate diverse and detailed visuals in various domains, and story
visualization is emerging as a particularly promising application. However, as
their use in real-world creative domains increases, the need for providing
enhanced control, refinement, and the ability to modify images post-generation
in a consistent manner becomes an important challenge. Existing methods often
lack the flexibility to apply fine or coarse edits while maintaining visual and
narrative consistency across multiple frames, preventing creators from
seamlessly crafting and refining their visual stories. To address these
challenges, we introduce Plot'n Polish, a zero-shot framework that enables
consistent story generation and provides fine-grained control over story
visualizations at various levels of detail.