📄 Abstract
Scene coordinate regression (SCR) models have proven to be powerful implicit
scene representations for 3D vision, enabling visual relocalization and
structure-from-motion. SCR models are trained specifically for one scene. If
training images imply insufficient multi-view constraints, SCR models
degenerate. We present a probabilistic reinterpretation of training SCR models,
which allows us to infuse high-level reconstruction priors. We investigate
multiple such priors, ranging from simple priors over the distribution of
reconstructed depth values to learned priors over plausible scene coordinate
configurations. For the latter, we train a 3D point cloud diffusion model on a
large corpus of indoor scans. Our priors push predicted 3D scene points towards
plausible geometry at each training step to increase their likelihood. On three
indoor datasets, our priors help learn better scene representations,
resulting in more coherent scene point clouds, higher registration rates, and
better camera poses, with a positive effect on downstream tasks such as novel
view synthesis and camera relocalization.
Key Contributions
This paper introduces a probabilistic framework for training scene coordinate regression (SCR) models that incorporates high-level reconstruction priors, ranging from simple priors over the distribution of reconstructed depth values to learned priors from a 3D point cloud diffusion model trained on indoor scans. The priors counteract the degeneracy of SCR models trained on images with insufficient multi-view constraints, yielding more coherent scene point clouds, higher registration rates, and better camera poses on three indoor datasets.
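To make the probabilistic reading concrete, the sketch below treats training as minimizing a negative log-posterior: a reprojection term (the likelihood) plus a prior term over reconstructed depth, the simplest kind of prior the abstract mentions. It is an illustration only, not the paper's implementation. The Gaussian reprojection model, the Laplace depth prior, and every name and default here (`reprojection_nll`, `depth_prior_nll`, `map_loss`, `mean_depth`, `scale`, `prior_weight`) are assumptions made for this example.

```python
import numpy as np

def project(points_cam, focal, center):
    """Pinhole projection of camera-space points (N, 3) to pixels (N, 2)."""
    z = points_cam[:, 2:3]
    return focal * points_cam[:, :2] / z + center

def reprojection_nll(scene_pts, pixels, R, t, focal, center, sigma_px=1.0):
    """Gaussian negative log-likelihood (up to a constant) of the reprojection error."""
    points_cam = scene_pts @ R.T + t
    residual = project(points_cam, focal, center) - pixels
    return 0.5 * np.sum(residual ** 2) / sigma_px ** 2

def depth_prior_nll(scene_pts, R, t, mean_depth=2.0, scale=1.0):
    """Laplace negative log-prior (up to a constant) on per-point depth.

    mean_depth and scale are hypothetical values chosen for this sketch.
    """
    depth = (scene_pts @ R.T + t)[:, 2]
    return np.sum(np.abs(depth - mean_depth)) / scale

def map_loss(scene_pts, pixels, R, t, focal, center, prior_weight=0.1):
    """Negative log-posterior: data term plus weighted prior term."""
    data_term = reprojection_nll(scene_pts, pixels, R, t, focal, center)
    prior_term = depth_prior_nll(scene_pts, R, t)
    return data_term + prior_weight * prior_term

# One camera at the origin observing two points at depth 2.
R, t = np.eye(3), np.zeros(3)
focal, center = 500.0, np.array([320.0, 240.0])
near = np.array([[0.0, 0.0, 2.0], [0.5, -0.5, 2.0]])
pixels = project(near, focal, center)

# Sliding the points along their viewing rays leaves the reprojection
# error unchanged, so a single view cannot tell the two solutions apart.
far = near * 2.0
print(reprojection_nll(near, pixels, R, t, focal, center),
      reprojection_nll(far, pixels, R, t, focal, center))
# Only the prior term separates them.
print(map_loss(near, pixels, R, t, focal, center),
      map_loss(far, pixels, R, t, focal, center))
```

With a single view, both point sets reproject perfectly, which is the kind of under-constrained situation in which the abstract says SCR models degenerate. In this toy setup, the prior term is what makes the combined loss prefer one solution over the other.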
Business Value
Improves the accuracy and reliability of 3D scene reconstruction from images, which is critical for applications like AR/VR content creation, robotic navigation, and digital twins.