MaX-DeepLab: End-to-End Panoptic Segmentation with Mask Transformers

About

We present MaX-DeepLab, the first end-to-end model for panoptic segmentation. Our approach simplifies the current pipeline, which depends heavily on surrogate sub-tasks and hand-designed components such as box detection, non-maximum suppression, and thing-stuff merging. Although these sub-tasks are tackled by area experts, they fail to comprehensively solve the target task. By contrast, MaX-DeepLab directly predicts class-labeled masks with a mask transformer, and is trained with a panoptic quality (PQ) inspired loss via bipartite matching. Our mask transformer employs a dual-path architecture that introduces a global memory path in addition to a CNN path, allowing direct communication with any CNN layer. As a result, MaX-DeepLab shows a significant 7.1% PQ gain in the box-free regime on the challenging COCO dataset, closing the gap between box-based and box-free methods for the first time. A small variant of MaX-DeepLab improves on DETR by 3.0% PQ with similar parameters and M-Adds. Furthermore, MaX-DeepLab, without test-time augmentation, achieves a new state-of-the-art 51.3% PQ on the COCO test-dev set. Code is available at https://github.com/google-research/deeplab2.
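To make the "bipartite matching" step in the abstract more concrete, here is a minimal sketch of one-to-one matching between ground-truth and predicted class-labeled masks. It is an illustration under stated assumptions, not the released implementation: masks are plain sets of pixel indices, the similarity is assumed to be the predicted probability of the ground-truth class multiplied by the Dice overlap of the masks, and a brute-force search over permutations stands in for the Hungarian algorithm. The names `dice`, `similarity`, and `match` are hypothetical.

```python
from itertools import permutations


def dice(a, b):
    """Dice coefficient between two masks given as sets of pixel indices."""
    if not a and not b:
        return 0.0
    return 2.0 * len(a & b) / (len(a) + len(b))


def similarity(gt, pred):
    """Assumed PQ-style similarity: class probability times mask Dice."""
    class_prob = pred["class_probs"].get(gt["label"], 0.0)
    return class_prob * dice(gt["mask"], pred["mask"])


def match(gts, preds):
    """Find the one-to-one assignment that maximizes total similarity.

    Returns (assignment, score), where assignment[i] is the index of the
    prediction matched to gts[i]. Brute force, so only suitable for tiny
    inputs; requires len(preds) >= len(gts).
    """
    best_assignment, best_score = None, -1.0
    for perm in permutations(range(len(preds)), len(gts)):
        score = sum(similarity(g, preds[j]) for g, j in zip(gts, perm))
        if score > best_score:
            best_assignment, best_score = perm, score
    return best_assignment, best_score


# Toy example: two ground-truth masks, three predictions.
gts = [
    {"label": "cat", "mask": {0, 1, 2, 3}},
    {"label": "sky", "mask": {4, 5, 6, 7}},
]
preds = [
    {"class_probs": {"sky": 0.9, "cat": 0.1}, "mask": {4, 5, 6}},
    {"class_probs": {"cat": 0.8, "sky": 0.2}, "mask": {0, 1, 2, 3}},
    {"class_probs": {"cat": 0.5}, "mask": {8}},
]

assignment, score = match(gts, preds)
print(assignment)  # (1, 0): "cat" pairs with prediction 1, "sky" with prediction 0
```

In this sketch the unmatched prediction (index 2) would be treated as a "no object" slot during training. A practical implementation would replace the permutation search with a polynomial-time assignment solver.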

Huiyu Wang, Yukun Zhu, Hartwig Adam, Alan Yuille, Liang-Chieh Chen • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Panoptic Segmentation | Cityscapes (val) | PQ 61.7 | 276 |
| Panoptic Segmentation | COCO (val) | PQ 51.1 | 219 |
| Panoptic Segmentation | COCO 2017 (val) | PQ 51.1 | 172 |
| Panoptic Segmentation | COCO (test-dev) | PQ 51.3 | 162 |
| Panoptic Segmentation | COCO 2017 (test-dev) | PQ 51.3 | 41 |
| Panoptic Segmentation | COCO (test) | PQ 49 | 23 |
| Panoptic Segmentation | COCO panoptic 133 categories (val) | PQ 51.1 | 12 |
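For readers unfamiliar with the PQ metric reported in the benchmark results above, the following is a small sketch of the standard panoptic quality definition: per class, the sum of IoUs over matched segment pairs divided by TP + 0.5·FP + 0.5·FN, then averaged over classes. It is a simplified illustration, not the official evaluation code: segments are `(label, set_of_pixel_indices)` tuples, a ground-truth and a predicted segment match when they share a label and their IoU exceeds 0.5, and details of the official tooling such as void and crowd regions are ignored. The function names are hypothetical.

```python
def iou(a, b):
    """Intersection over union of two masks given as sets of pixel indices."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def panoptic_quality(gts, preds):
    """Class-averaged PQ for lists of (label, mask) segments, in [0, 1]."""
    labels = {label for label, _ in gts} | {label for label, _ in preds}
    per_class = []
    for label in labels:
        gt_masks = [m for l, m in gts if l == label]
        pred_masks = [m for l, m in preds if l == label]
        used = set()   # indices of predictions already matched
        iou_sum = 0.0  # sum of IoUs over true-positive pairs
        for gt_mask in gt_masks:
            for j, pred_mask in enumerate(pred_masks):
                overlap = iou(gt_mask, pred_mask)
                if j not in used and overlap > 0.5:
                    used.add(j)
                    iou_sum += overlap
                    break
        tp = len(used)
        fp = len(pred_masks) - tp
        fn = len(gt_masks) - tp
        # Each label has at least one segment, so the denominator is > 0.
        per_class.append(iou_sum / (tp + 0.5 * fp + 0.5 * fn))
    return sum(per_class) / len(per_class) if per_class else 0.0


# Toy example: one imperfect match, one perfect match, one false positive.
gts = [("cat", {0, 1, 2, 3}), ("sky", {4, 5, 6, 7})]
preds = [("cat", {0, 1, 2}), ("sky", {4, 5, 6, 7}), ("dog", {8})]

pq = panoptic_quality(gts, preds)
print(f"PQ = {100 * pq:.1f}%")  # PQ = 58.3%
```

Here "cat" contributes 0.75, "sky" contributes 1.0, and the spurious "dog" prediction contributes 0, giving a mean of about 0.583. Benchmark tables typically report this value multiplied by 100.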
