Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: Foundation models are revolutionizing autonomous driving perception,
transitioning the field from narrow, task-specific deep learning models to
versatile, general-purpose architectures trained on vast, diverse datasets.
This survey examines how these models address critical challenges in autonomous
perception, including limitations in generalization, scalability, and
robustness to distributional shifts. The survey introduces a novel taxonomy
structured around four essential capabilities for robust performance in dynamic
driving environments: generalized knowledge, spatial understanding,
multi-sensor robustness, and temporal reasoning. For each capability, the
survey elucidates its significance and comprehensively reviews cutting-edge
approaches. Diverging from traditional method-centric surveys, our unique
framework prioritizes conceptual design principles, providing a
capability-driven guide for model development and clearer insights into
foundational aspects. We conclude by discussing key challenges, particularly
those associated with the integration of these capabilities into real-time,
scalable systems, and broader deployment challenges related to computational
demands and ensuring model reliability against issues like hallucinations and
out-of-distribution failures. The survey also outlines crucial future research
directions to enable the safe and effective deployment of foundation models in
autonomous driving systems.