Raw2Drive: Reinforcement Learning with Aligned World Models for End-to-End Autonomous Driving (in CARLA v2)

Abstract

Reinforcement Learning (RL) can mitigate the causal confusion and distribution shift inherent to imitation learning (IL). However, applying RL to end-to-end autonomous driving (E2E-AD) remains an open problem because of its training difficulty, and IL is still the mainstream paradigm in both academia and industry. Recently, Model-based Reinforcement Learning (MBRL) has demonstrated promising results in neural planning; however, these methods typically require privileged information as input rather than raw sensor data. We fill this gap by designing Raw2Drive, a dual-stream MBRL approach. Initially, we efficiently train an auxiliary privileged world model paired with a neural planner that uses privileged information as input. Subsequently, we introduce a raw sensor world model trained via our proposed Guidance Mechanism, which ensures consistency between the raw sensor world model and the privileged world model during rollouts. Finally, the raw sensor world model combines the prior knowledge embedded in the heads of the privileged world model to effectively guide the training of the raw sensor policy. Raw2Drive is so far the only RL-based end-to-end method on CARLA Leaderboard 2.0 and Bench2Drive, and it achieves state-of-the-art performance.

Authors (6)

Zhenjie Yang

Xiaosong Jia

Qifeng Li

Xue Yang

Maoqing Yao

Junchi Yan

Submitted

May 22, 2025

arXiv Category

cs.RO

Key Contributions

Raw2Drive proposes a dual-stream MBRL approach for end-to-end autonomous driving that effectively trains a raw sensor world model using privileged information. It employs an auxiliary privileged world model and a neural planner, and introduces a Guidance Mechanism to ensure consistency between the raw and privileged world models during rollouts.
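
The summary above does not spell out how the Guidance Mechanism is computed, so the following is only a minimal sketch of the general idea of rollout consistency, not the authors' implementation. It assumes a toy 1-D position/velocity state shared by both models, a single scalar parameter `gain` for the raw sensor model, and a mean-squared-error objective fitted by finite-difference gradient descent. All function names (`privileged_step`, `make_raw_step`, `rollout_consistency_loss`, `fit_gain`) and the dynamics themselves are hypothetical.

```python
def rollout(step_fn, state, actions):
    """Unroll a world model by applying step_fn along an action sequence."""
    states = []
    for action in actions:
        state = step_fn(state, action)
        states.append(state)
    return states


def privileged_step(state, action):
    # Toy stand-in for the privileged world model (assumption, not from the paper).
    pos, vel = state
    vel = vel + 0.1 * action
    return (pos + 0.1 * vel, vel)


def make_raw_step(gain):
    # Toy stand-in for the raw sensor world model with one scalar parameter.
    def step(state, action):
        pos, vel = state
        vel = vel + gain * action
        return (pos + 0.1 * vel, vel)
    return step


def rollout_consistency_loss(raw_states, priv_states):
    """Mean squared difference between two rollouts over all steps and dimensions."""
    total, count = 0.0, 0
    for raw, priv in zip(raw_states, priv_states):
        for r, p in zip(raw, priv):
            total += (r - p) ** 2
            count += 1
    return total / count


def fit_gain(actions, init_state, gain=0.0, lr=0.1, iters=200, eps=1e-4):
    """Fit `gain` so the raw rollout matches the frozen privileged rollout."""
    target = rollout(privileged_step, init_state, actions)

    def loss(g):
        raw = rollout(make_raw_step(g), init_state, actions)
        return rollout_consistency_loss(raw, target)

    for _ in range(iters):
        # Central finite difference stands in for backpropagation in this sketch.
        grad = (loss(gain + eps) - loss(gain - eps)) / (2 * eps)
        gain -= lr * grad
    return gain, loss(gain)


if __name__ == "__main__":
    fitted_gain, final_loss = fit_gain([1.0, -0.5, 0.5, 1.0, 0.0], (0.0, 0.0))
    print(fitted_gain, final_loss)
```

In this toy setup the privileged model is frozen and only the raw model's parameter is updated, which mirrors the ordering described in the abstract (privileged model first, raw sensor model second). A real system would replace the scalar `gain` with a learned encoder and dynamics network and compare latent states rather than hand-written tuples.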

Business Value

Paves the way for more robust and adaptable autonomous driving systems by leveraging RL and world models trained on realistic sensor inputs, potentially improving safety and performance.

Paper Metadata

Innovation Type

New Model Architecture/Approach

Deployment Feasibility

Moderate: requires substantial simulation infrastructure (such as CARLA) and careful integration of the dual-stream architecture.

Limitations Addressed

Difficulty in training RL for end-to-end autonomous driving; IL's limitations (causal confusion, distribution shift); MBRL methods typically requiring privileged information rather than raw sensor data

Performance Gains

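Enables RL training for E2E-AD directly from raw sensor data, addressing limitations of IL and of MBRL methods that depend on privileged inputs. The abstract reports state-of-the-art performance on CARLA Leaderboard 2.0 and Bench2Drive.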

Technical Tags

end-to-end autonomous driving, reinforcement learning (RL), imitation learning (IL), model-based RL (MBRL), world models, privileged information, raw sensor data, CARLA v2, guidance mechanism, neural planner

Research Topics

Autonomous Driving, Reinforcement Learning, Model-Based RL, Imitation Learning, Robotics, Perception

Methods & Architectures

Raw2Drive (dual-stream MBRL), Auxiliary privileged world model, Neural planner, Guidance Mechanism, Training via privileged information, World Models

Applications & Tasks

Autonomous Driving, Robotics, Simulation, Training RL for end-to-end autonomous driving, Bridging the gap between privileged and raw sensor data, Mitigating causal confusion and distribution shift, Improving neural planning, Learning from raw sensor inputs, Neural planning in driving scenarios

Datasets & Benchmarks

Datasets

CARLA v2

Driving performance metrics (e.g., safety, efficiency, adherence to rules)

Related Fields

Autonomous Driving, Reinforcement Learning, Robotics, Computer Vision, Machine Learning

Keywords

autonomous driving, reinforcement learning, model-based RL, world models, imitation learning, privileged information, raw sensor data, CARLA, neural planner, dual-stream, MBRL

Academic Context

#Autonomous Driving, #Reinforcement Learning, #Model-Based RL, #Imitation Learning, #Robotics, #Perception

Commercial Potential

Potential Products

End-to-end autonomous driving systems, Advanced driver-assistance systems (ADAS), Robotic control systems for vehicles

Target Industries

Automotive, Transportation, Logistics, Robotics

Use Case Examples

Developing self-driving car software that learns directly from camera and lidar data. Creating more robust planning modules for autonomous vehicles. Simulating and training driving policies in complex scenarios.

Competitive Edge

Addresses a key challenge in applying MBRL to autonomous driving by enabling training with raw sensor data, offering a potential advantage over IL and other MBRL methods.

Market Opportunity

Massive market for autonomous driving technology.

Revenue Models

Licensing of autonomous driving software, development of autonomous vehicle fleets, partnerships with automakers.

Resource Requirements

Compute Needs

High (requires significant compute for training RL agents and world models in simulation)

Data Requirements

Simulation data from environments like CARLA, potentially real-world driving data for fine-tuning.

Deployment Constraints

Requires sophisticated simulation environments; transferring learned policies to real-world vehicles (sim-to-real) remains a challenge.

Scalability

Scales with the complexity of the driving environment and the world model.

Regulatory Considerations

Significant regulatory hurdles for deployment of autonomous driving systems.

Production Readiness

Maturity Level

Research Prototype

Time to Market

5-10 years for widespread commercial deployment

Patent Potential

High (novel approach to MBRL for autonomous driving)
