
Pixel-wise Attentional Gating for Parsimonious Pixel Labeling

About

To achieve parsimonious inference in per-pixel labeling tasks with a limited computational budget, we propose a Pixel-wise Attentional Gating unit (PAG) that learns to selectively process a subset of spatial locations at each layer of a deep convolutional network. PAG is a generic, architecture-independent, problem-agnostic mechanism that can be readily "plugged in" to an existing model with fine-tuning. We utilize PAG in two ways: 1) learning spatially varying pooling fields that improve model performance without the extra computation cost associated with multi-scale pooling, and 2) learning a dynamic computation policy for each pixel to decrease total computation while maintaining accuracy. We extensively evaluate PAG on a variety of per-pixel labeling tasks, including semantic segmentation, boundary detection, monocular depth and surface normal estimation. We demonstrate that PAG allows competitive or state-of-the-art performance on these tasks. Our experiments show that PAG learns dynamic spatial allocation of computation over the input image which provides better performance trade-offs compared to related approaches (e.g., truncating deep models or dynamically skipping whole layers). Generally, we observe PAG can reduce computation by 10% without noticeable loss in accuracy and performance degrades gracefully when imposing stronger computational constraints.
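The idea of processing only a subset of spatial locations in a layer can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The function names (`pixel_gate`, `gated_residual_block`), the sigmoid-plus-threshold gate, and the pointwise (1x1) transform are all assumptions chosen for brevity. A real convolutional layer would also read neighboring pixels, and a trainable gate would need a differentiable relaxation of the hard threshold.

```python
import numpy as np

def pixel_gate(features, gate_weights, threshold=0.5):
    """Return a boolean (H, W) mask: True where a pixel should be processed.

    features: (H, W, C) array; gate_weights: (C,) array (hypothetical gate).
    """
    # Per-pixel linear score over channels, i.e. a 1x1 convolution.
    scores = features @ gate_weights            # shape (H, W)
    probs = 1.0 / (1.0 + np.exp(-scores))       # sigmoid
    return probs > threshold

def gated_residual_block(features, transform, gate_weights):
    """Apply `transform` residually at gated pixels; others pass through."""
    mask = pixel_gate(features, gate_weights)
    out = features.copy()
    # Only the selected pixels pay for the transform.
    out[mask] = features[mask] + transform(features[mask])
    return out, mask

rng = np.random.default_rng(0)
x = rng.normal(size=(4, 4, 8))                  # toy feature map
w_gate = rng.normal(size=8)                     # hypothetical gate weights
w_t = 0.1 * rng.normal(size=(8, 8))             # hypothetical layer weights
y, mask = gated_residual_block(x, lambda f: np.maximum(f @ w_t, 0.0), w_gate)
print(f"fraction of pixels processed: {mask.mean():.2f}")
```

In this sketch the fraction of `True` entries in `mask` is a rough proxy for the computation spent in the layer, which is the quantity a computational budget would constrain.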

Shu Kong, Charless Fowlkes • 2018

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Semantic segmentation | Cityscapes | mIoU 75.8 | 578 |
| Surface Normal Estimation | NYU v2 (test) | -- | 206 |
| Depth Estimation | NYU Depth V2 | -- | 177 |
| Monocular Depth Estimation | KITTI (test) | Abs Rel Error 11.74 | 103 |
| Semantic segmentation | NYU V2 | mIoU 46.5 | 74 |
| Monocular Depth Estimation | Cityscapes | Accuracy (delta < 1.25) 34.6 | 62 |
| Boundary Detection | BSDS500 | ODS F-score 0.792 | 37 |
| Semantic segmentation | Stanford-2D-3D | IoU 83.7 | 21 |
| Semantic segmentation | WildDash bench (test) | mIoU Meta Avg (cla) 22.2 | 19 |
| Semantic segmentation | KITTI (test) | mIoU 78.11 | 16 |
Showing 10 of 14 rows
