Abstract
Free-moving object reconstruction from monocular video remains challenging, particularly without reliable pose or depth cues and under arbitrary object motion. We introduce OnlineSplatter, a novel online feed-forward framework that generates high-quality, object-centric 3D Gaussians directly from RGB frames without requiring camera poses, depth priors, or bundle optimization. Our approach anchors the reconstruction to the first frame and progressively refines the object representation through a dense Gaussian primitive field, maintaining constant computational cost regardless of video sequence length. Our core contribution is a dual-key memory module that combines latent appearance-geometry keys with explicit directional keys and robustly fuses current-frame features with temporally aggregated object states. This design enables effective handling of free-moving objects via spatial-guided memory readout and an efficient sparsification mechanism, ensuring comprehensive yet compact object
coverage. Evaluations on real-world datasets demonstrate that OnlineSplatter
significantly outperforms state-of-the-art pose-free reconstruction baselines,
consistently improving with more observations while maintaining constant memory
and runtime.
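
To make the dual-key readout concrete, below is a minimal PyTorch sketch of how latent appearance-geometry keys and explicit directional keys could jointly drive attention over a memory of aggregated object states. All names, dimensions, and the additive fusion of the two similarity terms are illustrative assumptions, not the paper's actual implementation.

```python
# Minimal sketch of a dual-key memory readout (illustrative assumptions only:
# module name, dimensions, and additive similarity fusion are not from the paper).
import torch
import torch.nn.functional as F


class DualKeyMemory(torch.nn.Module):
    def __init__(self, dim: int = 256, slots: int = 1024):
        super().__init__()
        # Latent appearance-geometry keys, explicit directional keys (unit
        # viewing directions), and the temporally aggregated object state.
        self.latent_keys = torch.nn.Parameter(torch.randn(slots, dim))
        self.dir_keys = torch.nn.Parameter(F.normalize(torch.randn(slots, 3), dim=-1))
        self.values = torch.nn.Parameter(torch.randn(slots, dim))

    def forward(self, query_feat: torch.Tensor, query_dir: torch.Tensor) -> torch.Tensor:
        # query_feat: (N, dim) current-frame features; query_dir: (N, 3) view directions.
        latent_sim = query_feat @ self.latent_keys.t() / self.latent_keys.shape[-1] ** 0.5
        dir_sim = F.normalize(query_dir, dim=-1) @ F.normalize(self.dir_keys, dim=-1).t()
        # Fuse both similarity cues, then read out the aggregated object state.
        attn = torch.softmax(latent_sim + dir_sim, dim=-1)  # (N, slots)
        return attn @ self.values  # (N, dim)


# Usage: fuse 4096 frame tokens against the memory in a single readout.
mem = DualKeyMemory()
feats = torch.randn(4096, 256)
dirs = F.normalize(torch.randn(4096, 3), dim=-1)
fused = mem(feats, dirs)  # (4096, 256)
```

Because the memory holds a fixed number of slots, the readout cost stays constant per frame, which is consistent with the constant-memory, constant-runtime claim above.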
Authors (5)
Mark He Huang
Lin Geng Foo
Christian Theobalt
Ying Sun
De Wen Soh
Submitted
October 23, 2025
Key Contributions
Introduces OnlineSplatter, an online feed-forward framework for pose-free 3D reconstruction of free-moving objects from monocular video using 3D Gaussians. It employs a novel dual-key memory module to fuse appearance and geometry information robustly.
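
For intuition on how such an online pipeline can keep its Gaussian set compact, here is a minimal sketch of a fixed-budget sparsification step: whenever the running set exceeds a budget, it is greedily pruned back to a fixed size, favoring opaque and spatially spread primitives. The opacity-weighted farthest-point criterion and the function name are assumptions for illustration; the paper's actual mechanism may differ.

```python
# Minimal sketch of fixed-budget sparsification (the greedy opacity-weighted
# farthest-point criterion is an illustrative assumption, not the paper's method).
import torch


def sparsify_gaussians(means: torch.Tensor, opacities: torch.Tensor, budget: int = 4096):
    """Keep at most `budget` Gaussians: means (M, 3), opacities (M,)."""
    if means.shape[0] <= budget:
        return means, opacities
    keep = torch.zeros(means.shape[0], dtype=torch.bool)
    # Seed with the most opaque Gaussian, then repeatedly add the point that is
    # farthest from the current selection, weighted by its opacity.
    idx = opacities.argmax()
    keep[idx] = True
    dist = torch.cdist(means, means[idx][None]).squeeze(-1)
    for _ in range(budget - 1):
        idx = (dist * opacities).argmax()  # selected points have dist 0, so are skipped
        keep[idx] = True
        dist = torch.minimum(dist, torch.cdist(means, means[idx][None]).squeeze(-1))
    return means[keep], opacities[keep]
```

Pruning back to a fixed budget after each frame is one way to keep per-frame memory and runtime constant regardless of sequence length.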
Business Value
Enables real-time 3D scanning and reconstruction of dynamic objects using readily available monocular cameras, opening up applications in AR/VR, robotics, and content creation.