ALOcc: Adaptive Lifting-Based 3D Semantic Occupancy and Cost Volume-Based Flow Predictions

About

3D semantic occupancy and flow prediction are fundamental to spatiotemporal scene understanding. This paper proposes a vision-based framework with three targeted improvements. First, we introduce an occlusion-aware adaptive lifting mechanism incorporating depth denoising. This enhances the robustness of 2D-to-3D feature transformation while mitigating reliance on depth priors. Second, we enforce 3D-2D semantic consistency via jointly optimized prototypes, using confidence- and category-aware sampling to address the long-tail class problem. Third, to streamline joint prediction, we devise a BEV-centric cost volume to explicitly correlate semantic and flow features, supervised by a hybrid classification-regression scheme that handles diverse motion scales. Our purely convolutional architecture establishes new SOTA performance on multiple benchmarks for both semantic occupancy and joint occupancy semantic-flow prediction. We also present a family of models offering a spectrum of efficiency-performance trade-offs. Our real-time version exceeds all existing real-time methods in speed and accuracy, ensuring its practical viability.
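The abstract does not spell out the hybrid classification-regression scheme, so the following is only a minimal sketch of one common way such a scheme can be set up, not the authors' implementation. It assumes a per-voxel flow component is predicted as logits over a set of discrete bins, decoded as the probability-weighted mean of the bin centers, and trained with a cross-entropy term plus an L1 term. The names `BIN_CENTERS`, `decode_flow`, and `hybrid_loss`, and the bin values themselves, are hypothetical.

```python
import math

# Hypothetical, non-uniform bin centers (m/s): finer near zero, coarser for fast motion.
BIN_CENTERS = [-4.0, -2.0, -1.0, 0.0, 1.0, 2.0, 4.0]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def decode_flow(logits, bin_centers=BIN_CENTERS):
    """Continuous flow value as the expectation over the predicted bin distribution."""
    probs = softmax(logits)
    return sum(p * c for p, c in zip(probs, bin_centers))


def hybrid_loss(logits, target, bin_centers=BIN_CENTERS):
    """Cross-entropy on the nearest bin plus L1 error on the decoded value (assumed form)."""
    probs = softmax(logits)
    # Classification target: index of the bin center closest to the ground-truth flow.
    target_bin = min(range(len(bin_centers)), key=lambda i: abs(bin_centers[i] - target))
    classification = -math.log(probs[target_bin])
    # Regression target: the continuous ground-truth value itself.
    regression = abs(decode_flow(logits, bin_centers) - target)
    return classification + regression
```

In this sketch the classification term gives a strong signal about the coarse motion scale, while the regression term refines the value within and between bins; the actual weighting and binning used in the paper may differ.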

Dubing Chen, Jin Fang, Wencheng Han, Xinjing Cheng, Junbo Yin, Chenzhong Xu, Fahad Shahbaz Khan, Jianbing Shen • 2024

Related benchmarks

Task                                 | Dataset               | Result                 | Rank
3D Occupancy Prediction              | Occ3D-nuScenes (val)  | mIoU 50.6              | 144
3D Semantic Occupancy Prediction     | SurroundOcc (val)     | mIoU 0.24              | 36
3D Semantic Occupancy Prediction     | Occ3D                 | RayIoU 50.6            | 34
Occupancy Prediction                 | Occ3D v1.0 (test)     | RayIoU (Default) 43.7  | 24
3D Semantic Occupancy Prediction     | OpenOccupancy         | mIoU 22.4              | 15
3D Occupancy and Occupancy Flow      | OpenOcc (val)         | OccScore 43            | 10