
ReconViaGen: Towards Accurate Multi-view 3D Object Reconstruction via Generation


Abstract

Existing multi-view 3D object reconstruction methods heavily rely on sufficient overlap between input views, where occlusions and sparse coverage in practice frequently yield severe reconstruction incompleteness. Recent advancements in diffusion-based 3D generative techniques offer the potential to address these limitations by leveraging learned generative priors to hallucinate invisible parts of objects, thereby generating plausible 3D structures. However, the stochastic nature of the inference process limits the accuracy and reliability of generation results, preventing existing reconstruction frameworks from integrating such 3D generative priors. In this work, we comprehensively analyze the reasons why diffusion-based 3D generative methods fail to achieve high consistency, including (a) the insufficiency in constructing and leveraging cross-view connections when extracting multi-view image features as conditions, and (b) the poor controllability of iterative denoising during local detail generation, which easily leads to fine geometric and texture details that are plausible but inconsistent with the inputs. Accordingly, we propose ReconViaGen to innovatively integrate reconstruction priors into the generative framework and devise several strategies that effectively address these issues. Extensive experiments demonstrate that our ReconViaGen can reconstruct complete and accurate 3D models consistent with input views in both global structure and local details.

Project page: https://jiahao620.github.io/reconviagen
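To make point (a) concrete, here is a minimal NumPy sketch of what "cross-view connections" between per-view image features can look like: tokens from all views are pooled and each token attends over every view. This is an illustrative assumption, not the paper's architecture; the function name `cross_view_attention`, the tensor shapes, and the absence of learned projections are all hypothetical simplifications.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def cross_view_attention(view_tokens):
    """Mix features across views with plain dot-product attention.

    view_tokens: array of shape (V, N, D) = (views, tokens per view, channels).
    Returns the fused tokens (V, N, D) and the attention weights (V*N, V*N).
    Hypothetical simplification: no learned query/key/value projections.
    """
    V, N, D = view_tokens.shape
    tokens = view_tokens.reshape(V * N, D)      # pool tokens from all views
    scores = tokens @ tokens.T / np.sqrt(D)     # similarity between every token pair
    weights = softmax(scores, axis=-1)          # each row sums to 1
    fused = weights @ tokens                    # each token becomes a cross-view mix
    return fused.reshape(V, N, D), weights

rng = np.random.default_rng(0)
feats = rng.normal(size=(3, 4, 8))              # 3 views, 4 tokens each, 8 channels
fused, weights = cross_view_attention(feats)
print(fused.shape, weights.shape)
```

In a per-view encoder, by contrast, the attention weights would be block-diagonal, so no token could draw on evidence from another view; that missing coupling is the kind of insufficiency the abstract describes.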

Authors (9)

Jiahao Chang

Chongjie Ye

Yushuang Wu

Yuantao Chen

Yidan Zhang

Zhongjin Luo

+3 more

Submitted

October 27, 2025

arXiv Category

cs.CV


Key Contributions

ReconViaGen addresses the incompleteness of multi-view 3D reconstruction by integrating reconstruction priors into a diffusion-based 3D generative framework. It strengthens the cross-view connections used when extracting conditioning features and improves the controllability of iterative denoising, so that generative priors can hallucinate occluded parts while staying consistent with the input views in both global structure and local detail.
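The idea of making iterative denoising more controllable can be illustrated with a toy 1D example. The sketch below is not ReconViaGen's method: it swaps in a generic measurement-guidance step, and the neighbour-averaging "denoiser", the function name `guided_refinement`, and all parameter values are assumptions chosen only for illustration. Each iteration applies the stand-in denoiser and then pulls the observed entries back toward their measured values, while unobserved entries are filled in by the prior alone.

```python
import numpy as np

def guided_refinement(x_init, observed, mask, steps=50, guidance=0.5, smooth=0.1):
    """Toy guided refinement loop (hypothetical, for illustration only).

    x_init:   initial noisy signal, shape (n,)
    observed: measured values, valid where mask == 1
    mask:     1.0 at observed entries, 0.0 at unobserved ones
    """
    x = x_init.copy()
    for _ in range(steps):
        # Stand-in "denoiser": smooth each entry toward its neighbours' mean.
        neighbours = (np.roll(x, 1) + np.roll(x, -1)) / 2
        x = (1 - smooth) * x + smooth * neighbours
        # Guidance: move observed entries toward their measurements;
        # unobserved entries are left to the smoothing prior.
        x = x - guidance * mask * (x - observed)
    return x

rng = np.random.default_rng(0)
n = 32
target = np.sin(np.linspace(0, 2 * np.pi, n, endpoint=False))
mask = np.zeros(n)
mask[::2] = 1.0                       # only every other entry is observed
observed = target * mask
x0 = rng.normal(size=n)               # start from pure noise
x = guided_refinement(x0, observed, mask)

err_before = np.abs((x0 - target) * mask).mean()
err_after = np.abs((x - target) * mask).mean()
print(err_before, err_after)
```

Without the guidance line the loop would still produce a smooth, plausible signal, but nothing would tie it to the measurements, which mirrors the "plausible but inconsistent" failure mode described in the abstract.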

Business Value

Enables creation of more complete and accurate 3D models from limited or occluded views, valuable for AR/VR content creation, robotics, and digital twins.

Paper Metadata

Innovation Type

Algorithmic Improvement

Deployment Feasibility

Moderate. Requires significant computational resources for diffusion models, plus integration with existing 3D pipelines.

Limitations Addressed

Severe reconstruction incompleteness caused by occlusions and sparse coverage in multi-view 3D reconstruction, and the accuracy/reliability issues of existing diffusion-based generative methods for this task.

Technical Tags

3D object reconstruction, multi-view geometry, diffusion models, generative priors, occlusion handling, sparse coverage, hallucination, cross-view connections, feature extraction, iterative denoising

Research Topics

3D Computer Vision, Generative Models, Multi-view Reconstruction, Diffusion Models, Computer Graphics

Methods & Architectures

ReconViaGen, Diffusion-based 3D generative techniques, Multi-view feature extraction, Iterative denoising, Diffusion Models, Generative Models

Applications & Tasks

Computer Graphics, Robotics, Augmented Reality, Virtual Reality, 3D Modeling

Reconstruction incompleteness due to occlusion and sparse views, Stochastic nature of generative inference, Poor controllability of iterative denoising, Insufficient cross-view connections

3D Object Reconstruction, Handling Occlusions, Improving Reconstruction Accuracy, Leveraging Generative Priors

Related Fields

Computer Vision, Computer Graphics, Machine Learning, 3D Modeling, Generative AI

Keywords

3D reconstruction, multi-view, diffusion models, generative models, occlusion, sparse views, ReconViaGen, computer vision, 3D object, hallucination

Academic Context

#3D Computer Vision, #Generative Models, #Multi-view Reconstruction, #Diffusion Models, #Computer Graphics

Commercial Potential

Potential Products

3D scanning software, AR/VR content creation tools, Robotic perception systems

Target Industries

Gaming, Entertainment, Manufacturing, Architecture, Robotics

Use Case Examples

Reconstructing 3D objects from phone camera captures; Generating complete 3D models of real-world scenes; Improving robotic grasping by generating full object models

Competitive Edge

Improves upon existing multi-view reconstruction methods by effectively integrating generative priors from diffusion models, overcoming limitations of pure geometric approaches.

Resource Requirements

Compute Needs

High, especially for training and inference with diffusion models.

Data Requirements

Multi-view images of 3D objects.

Deployment Constraints

Requires significant computational power and potentially specialized hardware for real-time applications.

Scalability

Scalability depends on the efficiency of the diffusion model inference and the complexity of the scene.

Production Readiness

Maturity Level

Research
