Wonder3D++: Cross-domain Diffusion for High-fidelity 3D Generation from a Single Image

📄 Abstract

In this work, we introduce Wonder3D++, a novel method for efficiently generating high-fidelity textured meshes from single-view images. Recent methods based on Score Distillation Sampling (SDS) have shown the potential to recover 3D geometry from 2D diffusion priors, but they typically suffer from time-consuming per-shape optimization and inconsistent geometry. In contrast, certain works directly produce 3D information via fast network inference, but their results are often of low quality and lack geometric details. To holistically improve the quality, consistency, and efficiency of single-view reconstruction, we propose a cross-domain diffusion model that generates multi-view normal maps and the corresponding color images. To ensure consistent generation, we employ a multi-view cross-domain attention mechanism that facilitates information exchange across views and modalities. Lastly, we introduce a cascaded 3D mesh extraction algorithm that derives high-quality surfaces from the multi-view 2D representations in only about 3 minutes in a coarse-to-fine manner. Our extensive evaluations demonstrate that our method achieves high-quality reconstruction results, robust generalization, and good efficiency compared to prior works. Code is available at https://github.com/xxlong0/Wonder3D/tree/Wonder3D_Plus.
Authors (10): Yuxiao Yang, Xiao-Xiao Long, Zhiyang Dou, Cheng Lin, Yuan Liu, Qingsong Yan, +4 more
Submitted: November 3, 2025
arXiv Category: cs.CV

Key Contributions

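Wonder3D++ is a cross-domain diffusion model for generating high-fidelity textured meshes from a single image. It targets two weaknesses of earlier approaches: the slow per-shape optimization and inconsistent geometry of SDS-based methods, and the low quality and missing geometric detail of fast feed-forward methods. The model generates multi-view normal maps together with the corresponding color images, uses multi-view cross-domain attention to keep views and modalities consistent, and applies a cascaded, coarse-to-fine mesh extraction algorithm that produces a surface in about 3 minutes.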
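
The idea of attention that exchanges information across both views and modalities can be illustrated with a small sketch. This is not the authors' implementation: it assumes the simplest possible design, in which the tokens of every view in both domains (normal and color) are flattened into one sequence so that a single self-attention pass lets each token attend to all the others. The function names, the tensor sizes, and the absence of learned query/key/value projections are all simplifications made for illustration.

```python
import math
import random

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def joint_attention(tokens):
    """Single-head self-attention over one flat token list.

    Hypothetical helper: queries, keys, and values are the raw tokens
    (no learned projections), which keeps the sketch short.
    """
    dim = len(tokens[0])
    out = []
    for q in tokens:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in tokens]
        weights = softmax(scores)
        out.append([sum(w * v[d] for w, v in zip(weights, tokens)) for d in range(dim)])
    return out

def cross_view_cross_domain_attention(feats):
    """feats[domain][view] is a list of token vectors; returns the same nesting.

    All tokens are concatenated before attention, so a normal-map token in one
    view can attend to a color token in any other view (assumed design).
    """
    flat, index = [], []
    for d, views in enumerate(feats):
        for v, toks in enumerate(views):
            for t, tok in enumerate(toks):
                flat.append(tok)
                index.append((d, v, t))
    mixed = joint_attention(flat)
    out = [[[None] * len(toks) for toks in views] for views in feats]
    for (d, v, t), tok in zip(index, mixed):
        out[d][v][t] = tok
    return out

random.seed(0)
DOMAINS, VIEWS, TOKENS, DIM = 2, 6, 4, 8  # arbitrary illustrative sizes
feats = [[[[random.gauss(0, 1) for _ in range(DIM)] for _ in range(TOKENS)]
          for _ in range(VIEWS)] for _ in range(DOMAINS)]
mixed = cross_view_cross_domain_attention(feats)
```

Because every token sits in the same sequence, changing a color token in one view changes the output for normal-map tokens in every other view, which is the kind of coupling a consistency mechanism needs. A practical model would likely factor or restrict this attention to control cost, since the joint sequence length grows with both the number of views and the number of domains.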

Business Value

Enables fast creation of high-quality 3D assets from a single image, which could reduce the cost and time of 3D modeling in industries such as gaming, AR/VR, and e-commerce.