Rethinking Atrous Convolution for Semantic Image Segmentation

About

In this work, we revisit atrous convolution, a powerful tool to explicitly adjust a filter's field-of-view as well as control the resolution of feature responses computed by Deep Convolutional Neural Networks, in the application of semantic image segmentation. To handle the problem of segmenting objects at multiple scales, we design modules which employ atrous convolution in cascade or in parallel to capture multi-scale context by adopting multiple atrous rates. Furthermore, we propose to augment our previously proposed Atrous Spatial Pyramid Pooling module, which probes convolutional features at multiple scales, with image-level features encoding global context and further boost performance. We also elaborate on implementation details and share our experience on training our system. The proposed 'DeepLabv3' system significantly improves over our previous DeepLab versions without DenseCRF post-processing and attains comparable performance with other state-of-the-art models on the PASCAL VOC 2012 semantic image segmentation benchmark.

Liang-Chieh Chen, George Papandreou, Florian Schroff, Hartwig Adam • 2017
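
The abstract's two central ideas, that the atrous rate widens a filter's field-of-view and that several rates can be applied in parallel to capture multi-scale context, can be illustrated with a minimal 1-D sketch in pure Python. This is not the authors' implementation: the function names, the toy signal, and the rates (1, 2, 4) are assumptions chosen for illustration, and the real system operates on 2-D feature maps inside a deep network.

```python
def effective_kernel_size(kernel_size, rate):
    """Span (in input samples) covered by a kernel whose taps sit `rate` apart."""
    return kernel_size + (kernel_size - 1) * (rate - 1)


def atrous_conv1d(signal, kernel, rate=1):
    """'Valid' 1-D atrous convolution: y[i] = sum_k signal[i + rate*k] * kernel[k].

    With rate=1 this reduces to an ordinary (cross-correlation style) convolution.
    """
    span = effective_kernel_size(len(kernel), rate)
    out_len = max(len(signal) - span + 1, 0)
    return [
        sum(w * signal[i + k * rate] for k, w in enumerate(kernel))
        for i in range(out_len)
    ]


def parallel_atrous_branches(signal, kernel, rates):
    """Hypothetical helper: apply the same kernel at several rates side by side,
    mimicking the 'in parallel' multi-rate idea on a toy scale."""
    return {rate: atrous_conv1d(signal, kernel, rate) for rate in rates}


# Illustrative inputs (assumed, not taken from the paper).
signal = [1, 2, 3, 4, 5, 6, 7, 8, 9]
kernel = [1, 0, -1]
branches = parallel_atrous_branches(signal, kernel, rates=(1, 2, 4))
for rate, response in branches.items():
    print(rate, effective_kernel_size(len(kernel), rate), response)
```

A larger rate covers a wider input span with the same three weights, so each branch responds to structure at a different scale. In this valid-mode sketch the output also gets shorter as the rate grows, which a real network would typically avoid with padding.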

Related benchmarks

Task	Dataset	Result
Semantic segmentation	ADE20K (val)	mIoU 48.36	3089
Semantic segmentation	PASCAL VOC 2012 (val)	Mean IoU 82.7	2210
Semantic segmentation	PASCAL VOC 2012 (test)	mIoU 86.9	1485
Semantic segmentation	Cityscapes (test)	mIoU 81.34	1254
Semantic segmentation	ADE20K	--	1028
Semantic segmentation	ADE20K	mIoU 42.7	699
Semantic segmentation	Cityscapes	--	674
Semantic segmentation	Cityscapes (val)	mIoU 78.5	572
Semantic segmentation	Cityscapes (val)	mIoU 80.2	552
Semantic segmentation	Cityscapes	mIoU 80.1	526

Showing 10 of 120 rows
