Open-Vocabulary Semantic Segmentation with Decoupled One-Pass Network

About

Recently, the open-vocabulary semantic segmentation problem has attracted increasing attention and the best performing methods are based on two-stream networks: one stream for proposal mask generation and the other for segment classification using a pretrained visual-language model. However, existing two-stream methods require passing a great number of (up to a hundred) image crops into the visual-language model, which is highly inefficient. To address the problem, we propose a network that only needs a single pass through the visual-language model for each input image. Specifically, we first propose a novel network adaptation approach, termed patch severance, to restrict the harmful interference between the patch embeddings in the pre-trained visual encoder. We then propose classification anchor learning to encourage the network to spatially focus on more discriminative features for classification. Extensive experiments demonstrate that the proposed method achieves outstanding performance, surpassing state-of-the-art methods while being 4 to 7 times faster at inference. Code: https://github.com/CongHan0808/DeOP.git
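To make the efficiency argument concrete, the following is a generic sketch of single-pass mask classification, not the DeOP implementation. It assumes the visual encoder has already produced one embedding per image patch in a single forward pass, that each proposal mask is given as per-patch weights, and that each class name has a text embedding. The function names (`masked_average`, `cosine`, `classify_masks`) are hypothetical, and neither patch severance nor classification anchor learning is modeled here.

```python
import math


def masked_average(patch_embeddings, mask):
    """Average the patch embeddings, weighted by a binary or soft mask."""
    dim = len(patch_embeddings[0])
    total = sum(mask)
    if total == 0:
        return [0.0] * dim
    return [
        sum(w * p[d] for w, p in zip(mask, patch_embeddings)) / total
        for d in range(dim)
    ]


def cosine(a, b):
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    if na == 0 or nb == 0:
        return 0.0
    return sum(x * y for x, y in zip(a, b)) / (na * nb)


def classify_masks(patch_embeddings, masks, text_embeddings):
    """Return, for each mask, the index of the most similar text embedding.

    The patch embeddings are computed once per image (assumed to be done
    upstream), so the cost of the visual encoder does not grow with the
    number of proposal masks.
    """
    labels = []
    for mask in masks:
        pooled = masked_average(patch_embeddings, mask)
        scores = [cosine(pooled, t) for t in text_embeddings]
        labels.append(max(range(len(scores)), key=scores.__getitem__))
    return labels
```

In a two-stream pipeline, by contrast, each proposal would be cropped and sent through the visual encoder separately, which is the per-image cost of up to a hundred passes that the abstract describes.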

Cong Han, Yujie Zhong, Dengjie Li, Kai Han, Lin Ma (2023)

Related benchmarks

Task | Dataset | Result | Rank
Semantic segmentation | Pascal VOC (test) | mIoU 91.7 | 236
Semantic segmentation | ADE20K A-150 | mIoU 22.9 | 188
Semantic segmentation | Pascal Context 59 | mIoU 48.8 | 164
Semantic segmentation | PASCAL-Context 59 class (val) | mIoU 48.8 | 125
Semantic segmentation | ADE20K 847 | mIoU 7.10 | 83
Semantic segmentation | PASCAL-Context PC-459 | mIoU 9.4 | 69
Semantic segmentation | Pascal Context 59 | mIoU 48.8 | 67
Open Vocabulary Semantic Segmentation | Pascal VOC 20 | mIoU 91.7 | 62
Open Vocabulary Semantic Segmentation | ADE-847 | mIoU 7.1 | 59
Semantic segmentation | Pascal Context 459 | mIoU 9.4 | 58
Showing 10 of 23 rows
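
The results above are reported as mIoU (mean intersection over union). As a minimal sketch of how this metric is conventionally computed, assuming predictions and ground truth are given as flat lists of per-pixel class indices: the function name `mean_iou` is hypothetical, and skipping classes that appear in neither list is one common convention, not necessarily the one used by each benchmark.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over the classes that occur in the prediction or the target."""
    ious = []
    for c in range(num_classes):
        # Pixels where both prediction and ground truth are class c.
        inter = sum(1 for p, t in zip(pred, target) if p == c and t == c)
        # Pixels where either prediction or ground truth is class c.
        union = sum(1 for p, t in zip(pred, target) if p == c or t == c)
        if union > 0:
            ious.append(inter / union)
    return sum(ious) / len(ious) if ious else 0.0
```

Benchmarks usually accumulate the intersections and unions over the whole evaluation set before dividing, and report the result as a percentage, so a returned value of 0.488 would correspond to an mIoU of 48.8.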
