WeatherOcc3D: VLM-Assisted Adverse Weather Aware 3D Semantic Occupancy Prediction

About

While multi-modal 3D semantic occupancy prediction typically enhances robustness by fusing camera and LiDAR inputs, its effectiveness is fundamentally constrained by environmental variability. Specifically, camera sensors suffer from severe low-light degradation, while LiDAR sensors encounter significant backscatter noise during heavy precipitation. These adverse conditions create a modality trust problem, as static fusion strategies fail to adaptively re-weight inputs when a specific sensor becomes unreliable. To address this, we propose a VLM-assisted framework leveraging the pre-trained CLIP latent space to guide multi-sensor integration via linguistic environmental cues. We utilize a parameter-efficient adapter to align weather-specific text embeddings with sensor features, coupled with a gating strategy that decomposes environmental uncertainty into two factors: visibility and illumination. This enables the model to dynamically modulate the fusion ratio, prioritizing semantic camera features in clear daylight and shifting to geometric LiDAR priors during rainy nights. Evaluations on the nuScenes dataset demonstrate the versatility of our approach: applying the proposed framework to the OccMamba and M-CONet architectures achieves mIoU scores of 26.3 and 21.1, respectively, significantly outperforming their traditional baselines.
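The gating idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names (`condition_score`, `fuse`), the temperature value, and the choice of multiplying the visibility and illumination scores to obtain the camera weight are all assumptions made for illustration, and plain NumPy vectors stand in for CLIP text and scene embeddings.

```python
import numpy as np

def softmax(x):
    # Numerically stable softmax over a 1-D array.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def cosine(a, b):
    # Cosine similarity between two 1-D vectors.
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def condition_score(scene_emb, good_emb, bad_emb, temperature=0.1):
    # Hypothetical helper: probability in [0, 1] that the scene embedding
    # matches the "good" prompt (e.g. "clear", "daytime") rather than the
    # "bad" one (e.g. "rainy", "night"). Temperature is an assumed value.
    sims = np.array([cosine(scene_emb, good_emb), cosine(scene_emb, bad_emb)])
    return float(softmax(sims / temperature)[0])

def fuse(cam_feat, lidar_feat, visibility, illumination):
    # Assumed gate: the camera is trusted only when the scene is both
    # visible and well lit; the remaining weight goes to LiDAR.
    w_cam = visibility * illumination
    fused = w_cam * cam_feat + (1.0 - w_cam) * lidar_feat
    return fused, w_cam

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for text embeddings of weather and lighting prompts.
    clear, rainy, day, night = (rng.normal(size=8) for _ in range(4))
    # A scene embedding that leans toward "rainy" and "night".
    scene = 0.8 * rainy + 0.8 * night + 0.1 * rng.normal(size=8)
    vis = condition_score(scene, clear, rainy)
    ill = condition_score(scene, day, night)
    cam, lidar = np.ones(4), np.zeros(4)
    fused, w_cam = fuse(cam, lidar, vis, ill)
    print(f"visibility={vis:.3f} illumination={ill:.3f} camera weight={w_cam:.3f}")
```

With a product gate of this kind, a low score on either factor is enough to shift the fusion toward LiDAR, which matches the rainy-night behaviour the abstract describes. The actual framework learns the gate through an adapter rather than using a fixed rule.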

A. Enes Doruk, Abdelaziz Hussein, Hasan F. Ates • 2026

Related benchmarks

Task | Dataset | Result | Rank
3D Semantic Occupancy Prediction | nuScenes-OpenOccupancy (val) | Mean IoU 26.3 | 11
