ORIGEN: Zero-Shot 3D Orientation Grounding in Text-to-Image Generation

📄 Abstract

We introduce ORIGEN, the first zero-shot method for 3D orientation grounding in text-to-image generation across multiple objects and diverse categories. While previous work on spatial grounding in image generation has mainly focused on 2D positioning, it lacks control over 3D orientation. To address this, we propose a reward-guided sampling approach using a pretrained discriminative model for 3D orientation estimation and a one-step text-to-image generative flow model. While gradient-ascent-based optimization is a natural choice for reward-based guidance, it struggles to maintain image realism. Instead, we adopt a sampling-based approach using Langevin dynamics, which extends gradient ascent by simply injecting random noise, requiring just a single additional line of code. Additionally, we introduce adaptive time rescaling based on the reward function to accelerate convergence. Our experiments show that ORIGEN outperforms both training-based and test-time guidance methods across quantitative metrics and user studies.
Authors: Yunhong Min, Daehyeon Choi, Kyeongmin Yeo, Jihyun Lee, Minhyuk Sung
Submitted: March 28, 2025
arXiv category: cs.CV

Key Contributions

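Introduces ORIGEN, described by the authors as the first zero-shot method for 3D orientation grounding in text-to-image generation. It combines reward-guided sampling via Langevin dynamics with reward-based adaptive time rescaling, steering a one-step text-to-image flow model with a pretrained 3D orientation estimator so that object orientation can be controlled without training for each object or category.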
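
The abstract notes that Langevin dynamics extends gradient ascent by injecting random noise, at the cost of one extra line of code. The sketch below illustrates that relationship on a toy problem only; it is not the authors' implementation. The quadratic reward, the 2D "latent", the step size, and all function names are assumptions made for illustration, standing in for the orientation-estimator reward and the generative model's latent space.

```python
import numpy as np

def reward_grad(x, target):
    # Gradient of a toy reward r(x) = -0.5 * ||x - target||^2.
    # Hypothetical stand-in for a reward computed by an orientation estimator.
    return target - x

def gradient_ascent_step(x, target, step):
    # Plain gradient ascent: deterministically climbs toward the reward maximum.
    return x + step * reward_grad(x, target)

def langevin_step(x, target, step, rng):
    # Same update plus Gaussian noise scaled by sqrt(2 * step).
    noise = rng.standard_normal(x.shape)
    return x + step * reward_grad(x, target) + np.sqrt(2.0 * step) * noise

def run(n_steps=2000, step=0.01, seed=0):
    rng = np.random.default_rng(seed)
    target = np.array([1.0, -2.0])
    x_ga = np.zeros(2)
    x_ld = np.zeros(2)
    ld_samples = []
    for _ in range(n_steps):
        x_ga = gradient_ascent_step(x_ga, target, step)
        x_ld = langevin_step(x_ld, target, step, rng)
        ld_samples.append(x_ld.copy())
    return target, x_ga, np.array(ld_samples)

if __name__ == "__main__":
    target, x_ga, samples = run()
    print("gradient ascent endpoint:", x_ga)
    print("Langevin sample std per dimension:", samples.std(axis=0))
```

In this toy setting gradient ascent collapses onto the single reward maximizer, while the Langevin iterates keep moving around it and form a spread of samples. Sampling from a distribution rather than optimizing to a point is the property the abstract associates with preserving image realism.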

Business Value

Enables text-to-image generation in which the 3D orientation of each object in the image can be specified. This could benefit fields such as game development, VR/AR content creation, and product design, where images often need objects facing a particular direction and obtaining them currently takes manual iteration.