
GaussianFusion: Gaussian-Based Multi-Sensor Fusion for End-to-End Autonomous Driving

📄 Abstract

Multi-sensor fusion is crucial for improving the performance and robustness of end-to-end autonomous driving systems. Existing methods predominantly adopt either attention-based flatten fusion or bird's eye view fusion through geometric transformations. However, these approaches often suffer from limited interpretability or dense computational overhead. In this paper, we introduce GaussianFusion, a Gaussian-based multi-sensor fusion framework for end-to-end autonomous driving. Our method employs intuitive and compact Gaussian representations as intermediate carriers to aggregate information from diverse sensors. Specifically, we initialize a set of 2D Gaussians uniformly across the driving scene, where each Gaussian is parameterized by physical attributes and equipped with explicit and implicit features. These Gaussians are progressively refined by integrating multi-modal features. The explicit features capture rich semantic and spatial information about the traffic scene, while the implicit features provide complementary cues beneficial for trajectory planning. To fully exploit rich spatial and semantic information in Gaussians, we design a cascade planning head that iteratively refines trajectory predictions through interactions with Gaussians. Extensive experiments on the NAVSIM and Bench2Drive benchmarks demonstrate the effectiveness and robustness of the proposed GaussianFusion framework. The source code will be released at https://github.com/Say2L/GaussianFusion.
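
To make the Gaussian representation concrete, below is a minimal sketch of how a uniformly initialized set of 2D Gaussians with explicit and implicit features might be laid out. All names here (`init_gaussians`, `scene_range`, `explicit_feat`, `implicit_feat`) are hypothetical illustrations inferred from the abstract, not the authors' released implementation.

```python
# Hypothetical sketch (not the authors' code): a uniform grid of 2D Gaussians,
# each carrying physical attributes plus explicit and implicit feature vectors.
import torch

def init_gaussians(grid_size=32, scene_range=64.0, feat_dim=128):
    """Uniformly initialize 2D Gaussians over a square driving scene (BEV plane)."""
    # Gaussian centers on a regular grid covering [-scene_range/2, scene_range/2]^2
    xs = torch.linspace(-scene_range / 2, scene_range / 2, grid_size)
    ys = torch.linspace(-scene_range / 2, scene_range / 2, grid_size)
    mean = torch.stack(torch.meshgrid(xs, ys, indexing="ij"), dim=-1).reshape(-1, 2)

    n = mean.shape[0]
    return {
        "mean": mean,                               # (N, 2) positions in the BEV plane
        "scale": torch.ones(n, 2),                  # (N, 2) per-axis extent
        "rotation": torch.zeros(n, 1),              # (N, 1) in-plane orientation
        "explicit_feat": torch.zeros(n, feat_dim),  # semantic/spatial cues
        "implicit_feat": torch.zeros(n, feat_dim),  # planning-oriented cues
    }

gaussians = init_gaussians()
print(gaussians["mean"].shape)  # torch.Size([1024, 2])
```

In the paper's pipeline these attributes and features would then be progressively refined by integrating multi-modal sensor features; the sketch only shows the initial uniform layout.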
Authors (5)
Shuai Liu
Quanmin Liang
Zefeng Li
Boyang Li
Kai Huang
Submitted
May 27, 2025
arXiv Category
cs.RO

Key Contributions

GaussianFusion introduces a novel Gaussian-based multi-sensor fusion framework for end-to-end autonomous driving. It utilizes compact Gaussian representations as intermediate carriers to aggregate information from diverse sensors, offering improved interpretability and reduced computational overhead compared to existing attention-based or bird's eye view fusion methods. This approach allows for progressive refinement of Gaussians with explicit and implicit features, capturing rich semantic and spatial information for better scene understanding.
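
The cascade planning head can be pictured as a small stack of refinement stages, each cross-attending a trajectory query to the Gaussian features and adding a residual waypoint update. The sketch below is an assumption-laden illustration using standard PyTorch attention; the class and parameter names (`CascadePlanningHead`, `num_stages`, `horizon`) are hypothetical and do not reflect the released code.

```python
# Hypothetical sketch (not the authors' implementation): a cascade planning head
# that iteratively refines a trajectory query by attending to Gaussian features.
import torch
import torch.nn as nn

class CascadePlanningHead(nn.Module):
    def __init__(self, feat_dim=128, num_stages=3, horizon=8):
        super().__init__()
        # One cross-attention layer per refinement stage.
        self.stages = nn.ModuleList(
            nn.MultiheadAttention(feat_dim, num_heads=4, batch_first=True)
            for _ in range(num_stages)
        )
        # Each stage predicts a residual update to the (x, y) waypoints.
        self.heads = nn.ModuleList(
            nn.Linear(feat_dim, horizon * 2) for _ in range(num_stages)
        )
        self.horizon = horizon

    def forward(self, plan_query, gaussian_feats):
        # plan_query: (B, 1, C) ego planning query; gaussian_feats: (B, N, C)
        traj = plan_query.new_zeros(plan_query.shape[0], self.horizon, 2)
        for attn, head in zip(self.stages, self.heads):
            plan_query, _ = attn(plan_query, gaussian_feats, gaussian_feats)
            # Residual refinement of the predicted waypoints at each stage.
            traj = traj + head(plan_query).view(-1, self.horizon, 2)
        return traj

head = CascadePlanningHead()
out = head(torch.randn(2, 1, 128), torch.randn(2, 1024, 128))
print(out.shape)  # torch.Size([2, 8, 2])
```

The cascade structure mirrors the paper's idea of iterative trajectory refinement through interactions with the Gaussians, with each stage correcting the previous estimate rather than predicting the trajectory from scratch.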

Business Value

Enhances the safety and reliability of autonomous driving systems by providing a more efficient and interpretable way to fuse data from multiple sensors, potentially reducing development costs and improving performance in complex driving scenarios.