
Driving into the Future: Multiview Visual Forecasting and Planning with World Model for Autonomous Driving

About

In autonomous driving, predicting future events and evaluating the foreseeable risks empowers autonomous vehicles to better plan their actions, enhancing safety and efficiency on the road. To this end, we propose Drive-WM, the first driving world model compatible with existing end-to-end planning models. Through joint spatial-temporal modeling facilitated by view factorization, our model generates high-fidelity multiview videos in driving scenes. Building on its powerful generation ability, we showcase the potential of applying the world model for safe driving planning for the first time. In particular, Drive-WM enables driving into multiple futures based on distinct driving maneuvers, and determines the optimal trajectory according to image-based rewards. Evaluation on real-world driving datasets verifies that our method can generate high-quality, consistent, and controllable multiview videos, opening up possibilities for real-world simulations and safe planning.

Yuqi Wang, Jiawei He, Lue Fan, Hongxin Li, Yuntao Chen, Zhaoxiang Zhang • 2023
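
The planning idea in the abstract (roll out several candidate maneuvers into imagined futures, score each future with a reward, keep the best trajectory) can be illustrated with a minimal sketch. This is not the authors' implementation: `select_best_trajectory`, `toy_rollout`, and `toy_reward` are hypothetical names, and the toy rollout and reward are stand-ins for the multiview video world model and the image-based reward that the abstract describes.

```python
from typing import Callable, Dict, List, Tuple, TypeVar

# (x, y) waypoints in the ego frame; an assumed representation for illustration.
Trajectory = List[Tuple[float, float]]
Future = TypeVar("Future")


def select_best_trajectory(
    candidates: Dict[str, Trajectory],
    rollout: Callable[[Trajectory], Future],
    reward: Callable[[Future], float],
) -> Tuple[str, Dict[str, float]]:
    """Score every candidate maneuver on its predicted future and pick the best.

    `rollout` plays the role of a world model (trajectory -> predicted future);
    `reward` plays the role of a reward computed on that predicted future.
    """
    scores = {name: reward(rollout(traj)) for name, traj in candidates.items()}
    best = max(scores, key=lambda name: scores[name])
    return best, scores


def toy_rollout(traj: Trajectory) -> List[float]:
    # Hypothetical stand-in for a world model: the "future" is simply the
    # lateral offset at each waypoint, not generated video.
    return [y for _, y in traj]


def toy_reward(future: List[float]) -> float:
    # Hypothetical stand-in for an image-based reward: penalize the total
    # distance from the lane centre, assumed to lie at y = 0.
    return -sum(abs(y) for y in future)


candidates = {
    "keep_lane": [(1.0, 0.0), (2.0, 0.0), (3.0, 0.0)],
    "turn_left": [(1.0, 0.5), (2.0, 1.0), (3.0, 1.5)],
    "turn_right": [(1.0, -0.5), (2.0, -1.0), (3.0, -1.5)],
}

best, scores = select_best_trajectory(candidates, toy_rollout, toy_reward)
print(best, scores)
```

Passing the rollout and reward as functions keeps the selection loop independent of how futures are generated or scored, so a learned video model and a learned reward could in principle be substituted for the toy versions.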

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Open-loop planning | nuScenes (val) | L2 Error (3s) | 1.2 | 177 |
| Planning | nuScenes (val) | Collision Rate (Avg) | 26 | 80 |
| Video Generation | nuScenes (val) | FVD | 122.7 | 48 |
| Planning | nuScenes v1.0-trainval (val) | ST-P3 L2 Error (1s) | 0.43 | 39 |
| Planning | nuScenes | L2 Error (Avg) | 0.8 | 24 |
| Video Generation | nuScenes | FVD | 122.7 | 17 |
| Video Prediction | nuScenes (val) | FID | 12.99 | 16 |
| Frame prediction | nuScenes | FID | 15.8 | 16 |
| Camera Generation | nuScenes v1.0-trainval (val) | FID | 15.8 | 11 |
| Future frames generation | Bench2Drive (test) | FID | 17.8 | 8 |
Showing 10 of 12 rows
