Abstract
Monocular 3D lane detection aims to estimate the 3D positions of lanes from frontal-view (FV) images. However, existing methods are fundamentally constrained by the inherent ambiguity of single-frame input, which leads to inaccurate geometric predictions and poor lane integrity, especially for distant lanes. To overcome this, we propose to unlock the rich information embedded in the temporal evolution of the scene as the vehicle moves. Our Geometry-aware Temporal Aggregation Network (GTA-Net) systematically leverages temporal information from complementary perspectives. First, the Temporal Geometry Enhancement Module (TGEM) learns geometric consistency across consecutive frames, effectively recovering depth information from motion to build a reliable 3D scene representation. Second, to enhance lane integrity, the Temporal Instance-aware Query Generation (TIQG) module aggregates instance cues from past and present frames. Crucially, for lanes that are ambiguous in the current view, TIQG synthesizes a pseudo-future perspective to generate queries that reveal lanes which would otherwise be missed. Experiments demonstrate that GTA-Net achieves new state-of-the-art (SoTA) results, significantly outperforming existing monocular 3D lane detection solutions.
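The abstract does not include code, but the TGEM idea, fusing features from consecutive frames so that motion parallax can inform depth, can be sketched concretely. The PyTorch snippet below is a minimal illustration under our own assumptions: the module name, tensor shapes, the categorical depth-bin head, and the premise that the previous frame's features arrive already aligned to the current frame (e.g., warped with ego-motion) are ours, not the paper's exact design.

```python
import torch
import torch.nn as nn

class TemporalGeometryEnhancement(nn.Module):
    """Hypothetical TGEM-style sketch: fuse current and previous FV features
    to expose motion cues, then predict a per-pixel depth distribution."""

    def __init__(self, channels: int, num_depth_bins: int = 64):
        super().__init__()
        # Fuse the two frames' features (concatenated along channels).
        self.fuse = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
        )
        # Depth head: a categorical distribution over discrete depth bins,
        # a common way to lift FV features into 3D (an assumption here).
        self.depth_head = nn.Conv2d(channels, num_depth_bins, kernel_size=1)

    def forward(self, feat_t: torch.Tensor, feat_prev: torch.Tensor):
        # feat_t, feat_prev: (B, C, H, W) features of the current and
        # previous frames; feat_prev is assumed pre-aligned upstream.
        fused = self.fuse(torch.cat([feat_t, feat_prev], dim=1))
        depth_logits = self.depth_head(fused)      # (B, D, H, W)
        depth_prob = depth_logits.softmax(dim=1)   # per-pixel depth dist.
        return fused, depth_prob
```

The design choice worth noting is that depth is inferred from the *pair* of frames rather than a single image, which is the mechanism by which temporal evolution resolves the single-frame geometric ambiguity the abstract describes.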
Key Contributions
Proposes GTA-Net, a network that exploits the temporal evolution of consecutive frames to overcome monocular ambiguity in 3D lane detection. A Temporal Geometry Enhancement Module (TGEM) builds a reliable 3D scene representation from cross-frame geometric consistency, and a Temporal Instance-aware Query Generation (TIQG) module improves lane integrity by aggregating instance cues across past, present, and a synthesized pseudo-future view, as sketched below.
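The query-generation side can also be made concrete. The sketch below is one plausible reading of TIQG, not the authors' implementation: the conditioning scheme for the pseudo-future queries, the cross-attention refresh of past queries, and all names and shapes are our assumptions.

```python
import torch
import torch.nn as nn

class TemporalInstanceQueryGeneration(nn.Module):
    """Hypothetical TIQG-style sketch: build lane queries for the current
    frame, refresh queries carried over from the past via cross-attention,
    and add extra queries derived from a pseudo-future view of the scene."""

    def __init__(self, dim: int, num_queries: int, num_future: int = 16):
        super().__init__()
        self.current_queries = nn.Embedding(num_queries, dim)
        # Cross-attention lets past queries absorb current-frame evidence.
        # dim must be divisible by num_heads (assumed 8 here).
        self.refresh = nn.MultiheadAttention(dim, num_heads=8, batch_first=True)
        # Pseudo-future queries conditioned on the pooled scene feature;
        # this conditioning is our assumption, not the paper's exact design.
        self.future_proj = nn.Linear(dim, num_future * dim)
        self.num_future, self.dim = num_future, dim

    def forward(self, scene_feat: torch.Tensor, past_queries: torch.Tensor):
        # scene_feat: (B, N, C) flattened image tokens;
        # past_queries: (B, Q, C) queries propagated from the previous frame.
        B = scene_feat.size(0)
        cur = self.current_queries.weight.unsqueeze(0).expand(B, -1, -1)
        past, _ = self.refresh(past_queries, scene_feat, scene_feat)
        pooled = scene_feat.mean(dim=1)                          # (B, C)
        future = self.future_proj(pooled).view(B, self.num_future, self.dim)
        # A downstream decoder would consume the union of all three sets.
        return torch.cat([cur, past, future], dim=1)
```

For example, with `dim=256`, `num_queries=40`, a batch of 2, and 300 image tokens, the module returns a `(2, 40 + Q + 16, 256)` query tensor; the pseudo-future queries give the decoder candidates for lanes that are ambiguous or occluded in the current view.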
Business Value
Enhances the safety and reliability of autonomous driving systems by providing more accurate and robust 3D lane detection, which is crucial for navigation and path planning, especially in challenging conditions or for distant lanes.