Abstract
Reconstructing complete and interactive 3D scenes remains a fundamental
challenge in computer vision and robotics, particularly due to persistent
object occlusions and limited sensor coverage. Multiview observations from a
single scene scan often fail to capture the full structural details. Existing
approaches typically rely on multi-stage pipelines, such as segmentation,
background completion, and inpainting, or require per-object dense scanning;
both are error-prone and difficult to scale. We propose IGFuse, a novel
framework that reconstructs an interactive Gaussian scene by fusing
observations from multiple scans, where natural object rearrangement between
captures reveals previously occluded regions. Our method constructs
segmentation-aware Gaussian fields and enforces bi-directional photometric and
semantic consistency across scans. To handle spatial misalignments, we
introduce a pseudo-intermediate scene state for unified alignment, alongside
collaborative co-pruning strategies to refine geometry. IGFuse enables
high-fidelity rendering and object-level scene manipulation without dense
observations or complex pipelines. Extensive experiments validate the
framework's strong generalization to novel scene configurations, demonstrating
its effectiveness for real-world 3D reconstruction and real-to-simulation
transfer. Our project page is available online.
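
The abstract gives no equations, but the bi-directional consistency idea can be illustrated with a small sketch: render one scan's Gaussians in the other scan's object configuration and compare against that scan's observations, in both directions, with photometric and semantic terms. The PyTorch code below is our own rough reading, not the paper's implementation; the function name `bidirectional_consistency_loss`, the L1 + cross-entropy form, and the weight `w_sem` are all assumptions.

```python
import torch
import torch.nn.functional as F

def bidirectional_consistency_loss(
    rgb_a2b, rgb_b,    # scan-A Gaussians rendered in scan-B's layout vs. scan-B images
    rgb_b2a, rgb_a,    # the reverse direction
    sem_a2b, sem_b,    # per-pixel semantic logits vs. labels, same pairing
    sem_b2a, sem_a,
    w_sem: float = 0.1,  # assumed semantic weight, not from the paper
):
    """Symmetric photometric + semantic consistency between two scans."""
    photo = F.l1_loss(rgb_a2b, rgb_b) + F.l1_loss(rgb_b2a, rgb_a)
    sem = F.cross_entropy(sem_a2b, sem_b) + F.cross_entropy(sem_b2a, sem_a)
    return photo + w_sem * sem

# Usage with dummy tensors (B images, K semantic classes).
B, H, W, K = 1, 64, 64, 8
rgb = lambda: torch.rand(B, 3, H, W)
logits = lambda: torch.randn(B, K, H, W)
labels = lambda: torch.randint(0, K, (B, H, W))
loss = bidirectional_consistency_loss(rgb(), rgb(), rgb(), rgb(),
                                      logits(), labels(), logits(), labels())
```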
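The "collaborative co-pruning" mentioned in the abstract can be pictured similarly. One plausible (assumed) criterion is to keep a Gaussian only if it is supported by both scans, so that geometry visible in one scan but transparent in the other is discarded as a floater or stale structure; the threshold rule below is our hypothetical stand-in for whatever criterion the paper actually uses.

```python
import torch

def co_prune(opacity_a: torch.Tensor, opacity_b: torch.Tensor,
             tau: float = 0.05) -> torch.Tensor:
    """Return a boolean keep-mask over Gaussians shared by both scans.

    opacity_a / opacity_b: per-Gaussian accumulated opacity (or visibility)
    estimated when rendering against scan A and scan B respectively.
    tau: assumed visibility threshold; not specified in the abstract.
    """
    visible_a = opacity_a > tau
    visible_b = opacity_b > tau
    # Keep only Gaussians that both scans agree on.
    return visible_a & visible_b

keep = co_prune(torch.rand(10_000), torch.rand(10_000))
```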