
Pyramid Scene Parsing Network

About

Scene parsing is challenging for unrestricted open vocabulary and diverse scenes. In this paper, we exploit the capability of global context information by different-region-based context aggregation through our pyramid pooling module together with the proposed pyramid scene parsing network (PSPNet). Our global prior representation is effective at producing good quality results on the scene parsing task, while PSPNet provides a superior framework for pixel-level prediction tasks. The proposed approach achieves state-of-the-art performance on various datasets. It came first in the ImageNet scene parsing challenge 2016, the PASCAL VOC 2012 benchmark, and the Cityscapes benchmark. A single PSPNet yields a new record mIoU accuracy of 85.4% on PASCAL VOC 2012 and an accuracy of 80.2% on Cityscapes.
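As a rough illustration of the "different-region-based context aggregation" idea, here is a minimal pure-Python sketch of pyramid pooling on a single-channel feature map. It is not the authors' implementation. The bin sizes (1, 2, 3, 6) follow the PSPNet paper rather than this page; the function names are hypothetical; nearest-neighbour upsampling is swapped in for the bilinear interpolation a real network would use; and the per-level 1x1 convolution is left out.

```python
def adaptive_avg_pool(fmap, bins):
    """Average-pool a 2D map (list of rows) down to bins x bins cells."""
    h, w = len(fmap), len(fmap[0])
    pooled = []
    for i in range(bins):
        # Row range of cell i: floor(i*h/bins) .. ceil((i+1)*h/bins)
        r0, r1 = (i * h) // bins, ((i + 1) * h + bins - 1) // bins
        row = []
        for j in range(bins):
            c0, c1 = (j * w) // bins, ((j + 1) * w + bins - 1) // bins
            cell = [fmap[r][c] for r in range(r0, r1) for c in range(c0, c1)]
            row.append(sum(cell) / len(cell))
        pooled.append(row)
    return pooled


def upsample_nearest(pooled, h, w):
    """Resize a bins x bins map back to h x w (assumption: nearest neighbour)."""
    bins = len(pooled)
    return [[pooled[r * bins // h][c * bins // w] for c in range(w)]
            for r in range(h)]


def pyramid_pool(fmap, bin_sizes=(1, 2, 3, 6)):
    """Stack the input map with one context map per pyramid level."""
    h, w = len(fmap), len(fmap[0])
    channels = [fmap]
    for b in bin_sizes:
        channels.append(upsample_nearest(adaptive_avg_pool(fmap, b), h, w))
    return channels


if __name__ == "__main__":
    fmap = [[r * 6 + c for c in range(6)] for r in range(6)]
    out = pyramid_pool(fmap)
    print(len(out))      # input plus one map per pyramid level
    print(out[1][0][0])  # coarsest level: global average of the map
```

The coarsest level (one bin) carries a global average to every position, while finer levels keep progressively more local context. Concatenating them with the original map gives each pixel access to several context scales at once.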

Hengshuang Zhao, Jianping Shi, Xiaojuan Qi, Xiaogang Wang, Jiaya Jia • 2016

Related benchmarks

Task                   Dataset               Result      Rank
Semantic segmentation  ADE20K (val)          mIoU 46.27  3069
Semantic segmentation  PASCAL VOC 2012 (test)  mIoU 85.4   1477
Semantic segmentation  Cityscapes (test)     mIoU 81.2   1252
Semantic segmentation  ADE20K                mIoU 43.51  1028
Semantic segmentation  Cityscapes            mIoU 78.7   668
Semantic segmentation  Cityscapes (val)      mIoU 79.8   572
Semantic segmentation  Cityscapes (val)      mIoU 79.8   527
Semantic segmentation  CamVid (test)         mIoU 69.1   411
Semantic segmentation  PASCAL VOC (val)      mIoU 73.33  380
Semantic segmentation  PASCAL Context (val)  mIoU 47.8   360
Showing 10 of 224 rows
...
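Every result listed here is reported as mIoU (mean intersection over union). The sketch below shows one common way to compute it from flat lists of predicted and ground-truth class indices. It is an assumption-laden illustration, not any benchmark's official scorer: the function name is hypothetical, classes absent from both prediction and ground truth are skipped, and there is no handling of "ignore" labels, which real evaluation protocols often define differently.

```python
def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in pred or target.

    pred, target: equal-length sequences of integer class indices.
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            # Pixel counts toward both intersection and union of its class.
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatch enlarges the union of both classes involved.
            union[p] += 1
            union[t] += 1
    ious = [inter[c] / union[c] for c in range(num_classes) if union[c] > 0]
    if not ious:
        raise ValueError("no class present in pred or target")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    target = [0, 0, 1, 1, 2, 2]
    pred = [0, 1, 1, 1, 2, 0]
    # Per-class IoU: 1/3, 2/3, 1/2
    print(round(100 * mean_iou(pred, target, 3), 2))  # as a percentage
```

Multiplying by 100 gives the percentage form used in the listing. Because the mean is taken over classes rather than pixels, rare classes weigh as much as frequent ones.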
