PanopticDepth: A Unified Framework for Depth-aware Panoptic Segmentation
About
This paper presents a unified framework for depth-aware panoptic segmentation (DPS), which aims to reconstruct a 3D scene with instance-level semantics from a single image. Prior works address this problem by simply adding a dense depth regression head to panoptic segmentation (PS) networks, resulting in two independent task branches. This neglects the mutually beneficial relations between the two tasks: it fails to exploit readily available instance-level semantic cues that could boost depth accuracy, and it produces sub-optimal depth maps. To overcome these limitations, we propose a unified framework for the DPS task by applying a dynamic convolution technique to both the PS and depth prediction tasks. Specifically, instead of predicting depth for all pixels at once, we generate instance-specific kernels to predict depth and segmentation masks for each instance. Moreover, leveraging the instance-wise depth estimation scheme, we add instance-level depth cues to help supervise depth learning via a new depth loss. Extensive experiments on Cityscapes-DPS and SemKITTI-DPS show the effectiveness and promise of our method. We hope our unified solution to DPS can lead to a new paradigm in this area. Code is available at https://github.com/NaiyuGao/PanopticDepth.
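The instance-wise prediction idea described above can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function name `instance_wise_depth`, the use of 1x1 kernels, the sigmoid scaling by a hypothetical `max_depth`, and the argmax-based merging of per-instance outputs are all assumptions made for illustration, and the kernels and features are random rather than produced by a network.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def instance_wise_depth(mask_feat, depth_feat, mask_kernels, depth_kernels,
                        max_depth=80.0):
    """Apply per-instance 1x1 dynamic kernels to shared feature maps.

    mask_feat, depth_feat: (C, H, W) shared feature maps.
    mask_kernels, depth_kernels: (N, C), one kernel per instance
    (in a real network these would be predicted, here they are inputs).
    max_depth is an assumed scaling constant, not taken from the paper.
    """
    # A 1x1 dynamic convolution is a per-pixel dot product with each kernel.
    mask_logits = np.einsum("nc,chw->nhw", mask_kernels, mask_feat)
    inst_depth = max_depth * sigmoid(
        np.einsum("nc,chw->nhw", depth_kernels, depth_feat))

    # Assumed merging rule: each pixel belongs to the instance with the
    # highest mask logit and takes that instance's depth prediction.
    owner = mask_logits.argmax(axis=0)                                # (H, W)
    depth_map = np.take_along_axis(inst_depth, owner[None], axis=0)[0]  # (H, W)
    return owner, depth_map

rng = np.random.default_rng(0)
C, H, W, N = 8, 4, 6, 3
owner, depth_map = instance_wise_depth(
    rng.normal(size=(C, H, W)),
    rng.normal(size=(C, H, W)),
    rng.normal(size=(N, C)),
    rng.normal(size=(N, C)),
)
print(owner.shape, depth_map.shape)
```

The point of the sketch is that both the mask and the depth of an instance come from kernels tied to that instance, so depth is predicted per instance and then assembled into a full map, rather than regressed for all pixels by one shared head.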
Related benchmarks
| Task | Dataset | Result | Rank |
|---|---|---|---|
| Panoptic Segmentation | Cityscapes (val) | PQ 64.1 | 276 |
| Depth Prediction | Cityscapes (test) | RMSE 6.69 | 52 |
| Panoptic Segmentation | Cityscapes (test) | PQ 62 | 51 |
| Depth-aware Panoptic Segmentation | Cityscapes-DPS (val) | DPQ 67.4 | 16 |
| Monocular Depth Estimation | Cityscapes (val) | RMSE 6.91 | 2 |