
Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

About

Spatial pooling has been proven highly effective in capturing long-range contextual information for pixel-wise prediction tasks, such as scene parsing. In this paper, beyond conventional spatial pooling that usually has a regular shape of NxN, we rethink the formulation of spatial pooling by introducing a new pooling strategy, called strip pooling, which considers a long but narrow kernel, i.e., 1xN or Nx1. Based on strip pooling, we further investigate spatial pooling architecture design by 1) introducing a new strip pooling module that enables backbone networks to efficiently model long-range dependencies, 2) presenting a novel building block with diverse spatial pooling as a core, and 3) systematically comparing the performance of the proposed strip pooling and conventional spatial pooling techniques. Both novel pooling-based designs are lightweight and can serve as an efficient plug-and-play module in existing scene parsing networks. Extensive experiments on popular benchmarks (e.g., ADE20K and Cityscapes) demonstrate that our simple approach establishes new state-of-the-art results. Code is made available at https://github.com/Andrew-Qibin/SPNet.
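The pooling idea in the abstract can be illustrated with a minimal sketch. It is not the authors' module (see the linked repository for that): it only shows average pooling with a 1xN kernel (one value per row) and an Nx1 kernel (one value per column) on a single 2D feature map, followed by a simple broadcast-and-add fusion. The function names `strip_pool` and `fuse_strips` and the additive fusion are assumptions made for illustration, and any learned layers are left out.

```python
def strip_pool(feature):
    """Average-pool a 2D map (a list of rows) along full rows and full columns."""
    height, width = len(feature), len(feature[0])
    # Horizontal strip (1xN kernel): one average per row, giving H values.
    row_means = [sum(row) / width for row in feature]
    # Vertical strip (Nx1 kernel): one average per column, giving W values.
    col_means = [
        sum(feature[i][j] for i in range(height)) / height
        for j in range(width)
    ]
    return row_means, col_means


def fuse_strips(row_means, col_means):
    """Broadcast both strip outputs back to H x W and add them (assumed fusion)."""
    return [[r + c for c in col_means] for r in row_means]


# Toy 2 x 3 feature map.
feature = [
    [1.0, 2.0, 3.0],
    [4.0, 5.0, 6.0],
]
row_means, col_means = strip_pool(feature)
context = fuse_strips(row_means, col_means)
```

Each position in `context` combines a summary of its entire row and its entire column, which is how a long, narrow kernel can reach distant positions along one axis while staying narrow along the other.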

Qibin Hou, Li Zhang, Ming-Ming Cheng, Jiashi Feng • 2020

Related benchmarks

Task                   Dataset                            Result          Rank
Semantic segmentation  ADE20K (val)                       mIoU 45.6       2731
Semantic segmentation  Cityscapes (test)                  mIoU 82         1145
Semantic segmentation  Cityscapes (val)                   mIoU 81.9       332
Semantic segmentation  Pascal Context (test)              mIoU 54.5       176
Semantic segmentation  PASCAL-Context 60 classes (test)   mIoU 54.5       54
Face Parsing           iBugMask (test)                    Left Brow 73.2  6
