📄 Abstract
State-of-the-art visual generative AI tools hold immense potential to assist
users in the early ideation stages of creative tasks -- offering the ability to
generate (rather than search for) novel and unprecedented (instead of existing)
images of considerable quality that also adhere to boundless combinations of
user specifications. However, many large-scale text-to-image systems are
designed for broad applicability, yielding conventional output that may limit
creative exploration. They also employ interaction methods that may be
difficult for beginners. Given that creative end users operate in diverse,
context-specific, and often unpredictable ways, more variation and
personalization are necessary. We introduce POET, a real-time interactive tool
that (1) automatically discovers dimensions of homogeneity in text-to-image
generative models, (2) expands these dimensions to diversify the output space
of generated images, and (3) learns from user feedback to personalize
expansions. An evaluation with 28 users spanning four creative task domains
demonstrated POET's ability to generate results with higher perceived diversity
and help users reach satisfaction in fewer prompts during creative tasks,
thereby prompting them to deliberate and reflect more on a wider range of
possible results during the co-creative process. Focusing on visual
creativity, POET offers a first glimpse of how interaction techniques of future
text-to-image generation tools may support and align with more pluralistic
values and the needs of end users during the ideation stages of their work.