Imaginarium: Vision-guided High-Quality 3D Scene Layout Generation

📄 Abstract

Generating artistic and coherent 3D scene layouts is crucial in digital content creation. Traditional optimization-based methods are often constrained by cumbersome manual rules, while deep generative models face challenges in producing content with richness and diversity. Furthermore, approaches that utilize large language models frequently lack robustness and fail to accurately capture complex spatial relationships. To address these challenges, this paper presents a novel vision-guided 3D layout generation system. We first construct a high-quality asset library containing 2,037 scene assets and 147 3D scene layouts. Subsequently, we employ an image generation model to expand prompt representations into images, fine-tuning it to align with our asset library. We then develop a robust image parsing module to recover the 3D layout of scenes based on visual semantics and geometric information. Finally, we optimize the scene layout using scene graphs and overall visual semantics to ensure logical coherence and alignment with the images. Extensive user testing demonstrates that our algorithm significantly outperforms existing methods in terms of layout richness and quality. The code and dataset will be available at https://github.com/HiHiAllen/Imaginarium.
Authors (11): Xiaoming Zhu, Xu Huang, Qinghongbing Xie, Zhi Deng, Junsheng Yu, Yirui Guan, +5 more
Submitted: October 17, 2025
arXiv Category: cs.CV

Key Contributions

Imaginarium is a novel vision-guided system for generating high-quality 3D scene layouts. It leverages a curated asset library, fine-tuned image generation models, and robust image parsing to create coherent scenes, overcoming limitations of traditional methods and LLM-based approaches in capturing spatial relationships.
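
The final stage described above refines a layout using a scene graph. The paper's implementation is not reproduced here; what follows is only a minimal Python sketch of how one kind of scene-graph relation (an object resting on a supporting object) might be enforced on a layout. The names `SceneObject`, `SceneGraph`, and `snap_supported_objects`, the axis-aligned box representation, and the z-up convention are all assumptions made for illustration and are not taken from the authors' code.

```python
from dataclasses import dataclass, field


@dataclass
class SceneObject:
    """Hypothetical axis-aligned box: center [x, y, z] (z is up) and size (w, d, h)."""
    name: str
    center: list
    size: tuple


@dataclass
class SceneGraph:
    """Hypothetical scene graph holding objects and 'child rests on parent' edges."""
    objects: dict = field(default_factory=dict)
    supports: list = field(default_factory=list)  # (child_name, parent_name) pairs

    def add(self, obj: SceneObject) -> None:
        self.objects[obj.name] = obj

    def add_support(self, child: str, parent: str) -> None:
        self.supports.append((child, parent))


def snap_supported_objects(graph: SceneGraph) -> None:
    """Move each supported child so it rests on top of its parent.

    Edges are processed in insertion order, so chained supports
    (lamp on book on table) should be added parent-first.
    """
    for child_name, parent_name in graph.supports:
        child = graph.objects[child_name]
        parent = graph.objects[parent_name]

        # Place the child's bottom face on the parent's top face.
        parent_top = parent.center[2] + parent.size[2] / 2
        child.center[2] = parent_top + child.size[2] / 2

        # Clamp x and y so the child's footprint stays inside the parent's.
        for axis in (0, 1):
            slack = max((parent.size[axis] - child.size[axis]) / 2, 0.0)
            low = parent.center[axis] - slack
            high = parent.center[axis] + slack
            child.center[axis] = min(max(child.center[axis], low), high)


# Usage: a lamp that a parser placed off the table gets pulled back onto it.
graph = SceneGraph()
graph.add(SceneObject("table", [0.0, 0.0, 0.4], (1.0, 0.6, 0.8)))
graph.add(SceneObject("lamp", [2.0, 0.1, 0.0], (0.2, 0.2, 0.4)))
graph.add_support("lamp", "table")
snap_supported_objects(graph)
print(graph.objects["lamp"].center)
```

A real system would also need to handle rotations, collisions between siblings, and agreement with the guiding image, none of which this sketch attempts.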

Business Value

Accelerates the creation of complex 3D environments for games, VR/AR experiences, and visual effects, reducing production time and costs for digital content creators.