
Laplacian Pyramid Reconstruction and Refinement for Semantic Segmentation

About

CNN architectures have terrific recognition performance but rely on spatial pooling, which makes it difficult to adapt them to tasks that require dense, pixel-accurate labeling. This paper makes two contributions: (1) We demonstrate that while the apparent spatial resolution of convolutional feature maps is low, the high-dimensional feature representation contains significant sub-pixel localization information. (2) We describe a multi-resolution reconstruction architecture based on a Laplacian pyramid that uses skip connections from higher-resolution feature maps and multiplicative gating to successively refine segment boundaries reconstructed from lower-resolution maps. This approach yields state-of-the-art semantic segmentation results on the PASCAL VOC and Cityscapes segmentation benchmarks without resorting to more complex random-field inference or instance-detection-driven architectures.

Golnaz Ghiasi, Charless C. Fowlkes • 2016
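
The Laplacian pyramid idea named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' network: it assumes a single-channel score map, a 2x2 box filter for downsampling, and nearest-neighbor upsampling. The function names and the scalar `gates` argument are hypothetical and only stand in for the paper's learned multiplicative gating. The sketch shows how coarse maps plus per-level residuals recover the full-resolution map.

```python
import numpy as np

def downsample(x):
    """Halve resolution by averaging each 2x2 block (box filter, an assumption)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def upsample(x):
    """Double resolution by nearest-neighbor repetition."""
    return x.repeat(2, axis=0).repeat(2, axis=1)

def build_pyramid(x, levels):
    """Return [residual_0, ..., residual_{levels-1}, coarsest_map]."""
    pyramid = []
    for _ in range(levels):
        low = downsample(x)
        pyramid.append(x - upsample(low))  # detail lost by pooling at this level
        x = low
    pyramid.append(x)
    return pyramid

def reconstruct(pyramid, gates=None):
    """Rebuild coarse to fine; hypothetical scalar gates scale each residual."""
    x = pyramid[-1]
    for i in range(len(pyramid) - 2, -1, -1):
        gate = 1.0 if gates is None else gates[i]
        x = upsample(x) + gate * pyramid[i]
    return x

rng = np.random.default_rng(0)
scores = rng.normal(size=(8, 8))      # stand-in for a class score map
pyr = build_pyramid(scores, levels=2)
restored = reconstruct(pyr)           # all residuals kept: matches the input
blurry = reconstruct(pyr, gates=[0.0, 0.0])  # residuals gated off: coarse map only
```

With every gate set to 1 the reconstruction matches the input up to floating-point error; with the gates at 0 only the upsampled coarse map remains, which is the low-resolution prediction that the higher-resolution levels are meant to refine.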

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Semantic segmentation | PASCAL VOC 2012 (test) | mIoU 79.3 | 1415 |
| Semantic segmentation | Cityscapes (test) | mIoU 71.8 | 1154 |
| Semantic segmentation | Cityscapes | mIoU 70 | 658 |
| Semantic segmentation | Cityscapes (val) | mIoU 70 | 572 |
| Semantic segmentation | Cityscapes (val) | mIoU 70 | 108 |
| Semantic segmentation | Cityscapes fine (test) | mIoU 69.7 | 44 |
| Semantic segmentation | Cityscapes fine (val) | mIoU 70 | 42 |
| Scene Parsing | Cityscapes fine-annotation (test) | mIoU (Class) 69.7 | 10 |
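
The mIoU figures above are mean intersection-over-union scores, reported as percentages. As a rough illustration of the metric, the sketch below computes it for one pair of label maps. The function name `mean_iou` and the toy arrays are assumptions, and the official benchmark tools accumulate counts over the whole dataset rather than per image.

```python
import numpy as np

def mean_iou(pred, target, num_classes):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    ious = []
    for c in range(num_classes):
        p, t = pred == c, target == c
        union = np.logical_or(p, t).sum()
        if union == 0:
            continue  # class absent from both maps; skip it
        ious.append(np.logical_and(p, t).sum() / union)
    return float(np.mean(ious))

# Toy 2x4 label maps with two classes (illustrative data only).
target = np.array([[0, 0, 1, 1],
                   [0, 0, 1, 1]])
pred = np.array([[0, 0, 0, 1],
                 [0, 0, 1, 1]])
score = mean_iou(pred, target, num_classes=2)
```

Here class 0 has IoU 4/5 and class 1 has IoU 3/4, so `score` is 0.775, which a benchmark table would list as 77.5.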
