Abstract
Real-time, high-fidelity reconstruction of dynamic driving scenes is
challenged by complex dynamics and sparse views, with prior methods struggling
to balance quality and efficiency. We propose DrivingScene, an online,
feed-forward framework that reconstructs 4D dynamic scenes from only two
consecutive surround-view images. Our key innovation is a lightweight residual
flow network that predicts the non-rigid motion of dynamic objects per camera
on top of a learned static scene prior, explicitly modeling dynamics via scene
flow. We also introduce a coarse-to-fine training paradigm that circumvents the
instabilities common to end-to-end approaches. Experiments on the nuScenes dataset
show our image-only method simultaneously generates high-quality depth, scene
flow, and 3D Gaussian point clouds online, significantly outperforming
state-of-the-art methods in both dynamic reconstruction and novel view
synthesis.
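To make the residual-flow idea concrete, the minimal PyTorch sketch below shows one way a per-camera non-rigid residual could be predicted and composed with a static, ego-motion-induced scene flow. The module names, feature dimensions, and the soft dynamic mask are illustrative assumptions, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class ResidualFlowHead(nn.Module):
    """Hypothetical lightweight head: predicts a per-pixel 3D residual
    (non-rigid) scene flow from one camera's image features."""
    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(feat_dim, 32, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(32, 3, 3, padding=1),  # 3 channels: (dx, dy, dz)
        )

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        return self.net(feats)

def compose_scene_flow(static_flow: torch.Tensor,
                       residual_flow: torch.Tensor,
                       dynamic_mask: torch.Tensor) -> torch.Tensor:
    """Total flow = static (ego-motion-induced) flow plus a residual term
    that is only active on (soft) dynamic regions."""
    return static_flow + dynamic_mask * residual_flow

# Toy usage for a single camera: B x C x H x W features, B x 3 x H x W flows.
B, H, W = 1, 56, 100
feats = torch.randn(B, 64, H, W)        # image features (assumed shape)
static_flow = torch.randn(B, 3, H, W)   # flow from static prior + ego-motion
dynamic_mask = torch.rand(B, 1, H, W)   # soft mask of moving objects (assumed)

head = ResidualFlowHead()
residual = head(feats)
total_flow = compose_scene_flow(static_flow, residual, dynamic_mask)
print(total_flow.shape)  # torch.Size([1, 3, 56, 100])
```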
Authors (6)
Qirui Hou
Wenzhang Sun
Chang Zeng
Chunfeng Wang
Hao Li
Jianxun Cui
Submitted
October 14, 2025
Key Contributions
DrivingScene is a novel online, feed-forward framework for real-time, high-fidelity reconstruction of dynamic driving scenes using only two consecutive surround-view images. It introduces a lightweight residual flow network to model non-rigid motion and a coarse-to-fine training paradigm, achieving state-of-the-art performance in dynamic reconstruction and novel view synthesis.
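As a rough illustration of the coarse-to-fine idea, the toy PyTorch loop below first fits a static-prior network alone, then unfreezes a residual branch for joint fine-tuning at a lower learning rate. The networks, losses, and schedule are placeholder assumptions rather than DrivingScene's actual training recipe.

```python
import torch
import torch.nn as nn

# Toy stand-ins: a "static prior" network and a "residual flow" network.
# Sizes and losses are illustrative, not the paper's modules.
static_prior = nn.Conv2d(3, 3, 3, padding=1)   # predicts a static scene quantity
residual_net = nn.Conv2d(3, 3, 3, padding=1)   # predicts a non-rigid residual

def fake_batch():
    x = torch.randn(2, 3, 32, 32)       # stand-in for two consecutive frames
    target = torch.randn(2, 3, 32, 32)  # placeholder supervision target
    return x, target

# Stage 1 (coarse): train only the static prior; residual branch stays frozen.
for p in residual_net.parameters():
    p.requires_grad_(False)
opt = torch.optim.Adam(static_prior.parameters(), lr=1e-4)
for step in range(5):
    x, target = fake_batch()
    loss = nn.functional.mse_loss(static_prior(x), target)
    opt.zero_grad(); loss.backward(); opt.step()

# Stage 2 (fine): unfreeze the residual branch and train jointly at a lower LR,
# composing static prediction + residual before computing the loss.
for p in residual_net.parameters():
    p.requires_grad_(True)
opt = torch.optim.Adam(
    list(static_prior.parameters()) + list(residual_net.parameters()), lr=2e-5)
for step in range(5):
    x, target = fake_batch()
    pred = static_prior(x) + residual_net(x)  # residual composition (assumed form)
    loss = nn.functional.mse_loss(pred, target)
    opt.zero_grad(); loss.backward(); opt.step()

print("two-stage toy training finished")
```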
Business Value
Enables real-time, high-fidelity 3D scene understanding for autonomous vehicles, advanced driver-assistance systems (ADAS), and immersive AR/VR experiences.