Street-View Image Generation from a Bird's-Eye View Layout

About

Bird's-Eye View (BEV) Perception has received increasing attention in recent years as it provides a concise and unified spatial representation across views and benefits a diverse set of downstream driving applications. At the same time, data-driven simulation for autonomous driving has been a focal point of recent research but with few approaches that are both fully data-driven and controllable. Instead of using perception data from real-life scenarios, an ideal model for simulation would generate realistic street-view images that align with a given HD map and traffic layout, a task that is critical for visualizing complex traffic scenarios and developing robust perception models for autonomous driving. In this paper, we propose BEVGen, a conditional generative model that synthesizes a set of realistic and spatially consistent surrounding images that match the BEV layout of a traffic scenario. BEVGen incorporates a novel cross-view transformation with spatial attention design which learns the relationship between cameras and map views to ensure their consistency. We evaluate the proposed model on the challenging NuScenes and Argoverse 2 datasets. After training, BEVGen can accurately render road and lane lines, as well as generate traffic scenes with diverse weather conditions and times of day.

Alexander Swerdlow, Runsheng Xu, Bolei Zhou • 2023

Related benchmarks

Task                                     | Dataset                      | Result                 | Rank
-----------------------------------------|------------------------------|------------------------|-----
Object Detection                         | nuScenes (val)               | --                     | 41
Map Segmentation                         | nuScenes (val)               | --                     | 23
Video Prediction                         | nuScenes (val)               | FID 25.54              | 16
Camera Generation                        | nuScenes v1.0-trainval (val) | FID 25.54              | 11
Driving Scene Generation                 | nuScenes (val)               | FID 25.54              | 9
Controllable Image Generation            | nuScenes (val)               | FID 25.5               | 7
Controllable Multi-view Video Generation | nuScenes (val)               | mIoU (Foreground) 5.89 | 7
Object Detection                         | nuScenes v1.0-trainval (val) | AP (Car) 47.3          | 5
Map Segmentation                         | nuScenes v1.0-trainval (val) | mIoU Road 71.9         | 5
Place Recognition                        | nuScenes (val)               | AR@1 31.2              | 4
