
Tri-Perspective View for Vision-Based 3D Semantic Occupancy Prediction

About

Modern methods for vision-centric autonomous driving perception widely adopt the bird's-eye-view (BEV) representation to describe a 3D scene. Although BEV is more efficient than a voxel representation, a single plane has difficulty describing the fine-grained 3D structure of a scene. To address this, we propose a tri-perspective view (TPV) representation, which accompanies BEV with two additional perpendicular planes. We model each point in 3D space by summing its projected features on the three planes. To lift image features to the 3D TPV space, we further propose a transformer-based TPV encoder (TPVFormer) to obtain the TPV features effectively. We employ the attention mechanism to aggregate the image features corresponding to each query in each TPV plane. Experiments show that our model, trained with sparse supervision, effectively predicts the semantic occupancy for all voxels. We demonstrate for the first time that a method using only camera inputs can achieve performance comparable to LiDAR-based methods on the LiDAR segmentation task on nuScenes. Code: https://github.com/wzzheng/TPVFormer.
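The summation step described above can be illustrated with a minimal NumPy sketch. It is not taken from the TPVFormer code: the function name `tpv_point_features`, the axis ordering of the three planes, and the use of integer (nearest-cell) lookups rather than interpolated sampling are all assumptions made for illustration, and the plane features are random placeholders rather than encoder outputs.

```python
import numpy as np

def tpv_point_features(plane_hw, plane_dh, plane_wd, points):
    """Sum the features found at a point's projections onto three planes.

    plane_hw: (H, W, C) top (BEV) plane
    plane_dh: (D, H, C) and plane_wd: (W, D, C) perpendicular planes
    points:   (N, 3) integer grid indices (h, w, d); hypothetical layout
    """
    h, w, d = points[:, 0], points[:, 1], points[:, 2]
    # Each projection drops one coordinate; the three lookups are summed.
    return plane_hw[h, w] + plane_dh[d, h] + plane_wd[w, d]

# Placeholder plane features (random, for illustration only).
rng = np.random.default_rng(0)
H, W, D, C = 4, 5, 3, 8
plane_hw = rng.normal(size=(H, W, C))
plane_dh = rng.normal(size=(D, H, C))
plane_wd = rng.normal(size=(W, D, C))

points = np.array([[0, 0, 0], [3, 4, 2]])
feats = tpv_point_features(plane_hw, plane_dh, plane_wd, points)

# The same sum for every voxel at once, via broadcasting to (H, W, D, C).
volume = (
    plane_hw[:, :, None, :]
    + plane_dh.transpose(1, 0, 2)[:, None, :, :]
    + plane_wd[None, :, :, :]
)

# Storage: three planes versus a dense voxel grid, in feature vectors.
plane_cells = H * W + D * H + W * D
voxel_cells = H * W * D
```

In this sketch the three planes hold `H*W + D*H + W*D` feature vectors while a dense voxel grid holds `H*W*D`, which is the efficiency argument for plane-based representations; the dense `volume` is built here only to check that the per-point lookup and the broadcast sum agree.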

Yuanhui Huang, Wenzhao Zheng, Yunpeng Zhang, Jie Zhou, Jiwen Lu • 2023

Related benchmarks

Task | Dataset | Metric | Result | Rank
3D Occupancy Prediction | Occ3D-nuScenes (val) | mIoU | 2.83e+3 | 144
LiDAR Semantic Segmentation | nuScenes official (test) | mIoU | 69.4 | 132
Semantic Scene Completion | SemanticKITTI (val) | mIoU (Mean IoU) | 11.36 | 84
Semantic Scene Completion | SemanticKITTI official (test) | mIoU | 11.26 | 50
Semantic Scene Completion | SemanticKITTI (test) | Overall IoU | 34.25 | 48
Semantic Occupancy Prediction | Occ3D (val) | mIoU | 34.2 | 37
3D Semantic Occupancy Prediction | SurroundOcc (val) | mIoU | 0.171 | 36
Semantic Scene Completion | SSCBench-KITTI-360 (test) | IoU | 40.22 | 35
3D Semantic Occupancy Prediction | SurroundOcc-nuScenes (val) | IoU | 30.86 | 31
Semantic Scene Completion | SemanticKITTI hidden (test) | SSC mIoU | 0.1126 | 23
Showing 10 of 23 rows
