Efficient piecewise training of deep structured models for semantic segmentation

About

Recent advances in semantic image segmentation have mostly been achieved by training deep convolutional neural networks (CNNs). We show how to improve semantic segmentation through the use of contextual information; specifically, we explore 'patch-patch' context between image regions, and 'patch-background' context. For learning from the patch-patch context, we formulate Conditional Random Fields (CRFs) with CNN-based pairwise potential functions to capture semantic correlations between neighboring patches. Efficient piecewise training of the proposed deep structured model is then applied to avoid repeated expensive CRF inference for backpropagation. For capturing the patch-background context, we show that a network design with traditional multi-scale image input and sliding pyramid pooling is effective for improving performance. Our experimental results set new state-of-the-art performance on a number of popular semantic segmentation datasets, including NYUDv2, PASCAL VOC 2012, PASCAL-Context, and SIFT-flow. In particular, we achieve an intersection-over-union score of 78.0 on the challenging PASCAL VOC 2012 dataset.

Guosheng Lin, Chunhua Shen, Anton van den Hengel, Ian Reid • 2015
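
The abstract's point about piecewise training can be illustrated with a small sketch. Instead of normalizing the full CRF with a global partition function, which would require CRF inference at every gradient step, each unary and pairwise potential is normalized locally with its own softmax, and the training loss is the sum of the resulting per-factor negative log-likelihoods. The code below is only an illustration under stated assumptions, not the authors' implementation: the function names, the plain-list data layout, and the convention that scores are negative energies (higher means more likely) are all hypothetical, and in the paper the scores would come from CNNs rather than fixed lists.

```python
import math


def log_softmax(scores):
    """Numerically stable log-softmax over a flat list of scores."""
    m = max(scores)
    log_z = m + math.log(sum(math.exp(s - m) for s in scores))
    return [s - log_z for s in scores]


def piecewise_nll(unary_scores, pairwise_scores, labels, edges):
    """Negative log of a piecewise objective for one image (hypothetical helper).

    unary_scores[p][k]       : score of label k at node p (assumed: higher = more likely)
    pairwise_scores[e][j][k] : score of the label pair (j, k) on edge e
    labels[p]                : ground-truth label of node p
    edges[e]                 : (p, q) node indices joined by edge e
    """
    loss = 0.0
    # Each unary factor is normalized over its own K labels only.
    for p, scores in enumerate(unary_scores):
        loss -= log_softmax(scores)[labels[p]]
    # Each pairwise factor is normalized over its own K*K label pairs only,
    # so no global partition function (and no CRF inference) is needed.
    for e, (p, q) in enumerate(edges):
        table = pairwise_scores[e]
        k = len(table[0])
        flat = [s for row in table for s in row]
        loss -= log_softmax(flat)[labels[p] * k + labels[q]]
    return loss


# Toy usage: two patches, two classes, one edge between the patches.
unary = [[2.0, 0.0], [0.0, 1.0]]
pairwise = [[[0.0, 1.5], [0.0, 0.0]]]
loss = piecewise_nll(unary, pairwise, labels=[0, 1], edges=[(0, 1)])
```

Because every term is an ordinary softmax loss, the gradient with respect to each score is available in closed form, which is what lets the potentials be trained with standard backpropagation. Inference over the full CRF is still needed at prediction time.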

Related benchmarks

Task	Dataset	Result
Semantic segmentation	PASCAL VOC 2012 (test)	mIoU 79.1	1485
Semantic segmentation	Cityscapes (test)	mIoU 71.6	1254
Semantic segmentation	Cityscapes (val)	mIoU 71.6	572
Semantic segmentation	PASCAL Context (val)	mIoU 43.3	360
Semantic segmentation	NYU v2 (test)	mIoU 40.6	304
Semantic segmentation	Pascal VOC (test)	mIoU 78	268
Semantic segmentation	Pascal Context (test)	mIoU 43.3	223
Semantic segmentation	Pascal Context	mIoU 43.3	217
Semantic segmentation	NYU Depth V2 (test)	mIoU 40.6	183
Semantic segmentation	NYUD v2	mIoU 40.6	169

Showing 10 of 27 rows
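
The mIoU (mean intersection-over-union) values in the table can be understood through a short sketch. For each class, IoU is the number of pixels where prediction and ground truth both carry that class, divided by the number of pixels where either does; mIoU averages this over the classes that occur. The function name, the flat-list input layout, and the `ignore_label=255` default (the usual marker for void pixels in PASCAL VOC annotations) are assumptions made for illustration; official benchmark scripts may differ in details such as how absent classes are handled.

```python
def mean_iou(pred, truth, num_classes, ignore_label=255):
    """Mean IoU in percent over flat label lists (hypothetical helper).

    pred, truth  : equal-length sequences of integer class labels per pixel
    ignore_label : ground-truth value to skip (assumed 255, as in VOC void pixels)
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, truth):
        if t == ignore_label:
            continue
        if p == t:
            # Pixel counts once toward both intersection and union of class t.
            inter[t] += 1
            union[t] += 1
        else:
            # Mismatch: pixel enlarges the union of both classes involved.
            union[t] += 1
            if 0 <= p < num_classes:
                union[p] += 1
    # Average only over classes that appear in prediction or ground truth.
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    if not ious:
        raise ValueError("no labeled pixels to evaluate")
    return 100.0 * sum(ious) / len(ious)


# Toy usage: four pixels, two classes, one mislabeled pixel.
score = mean_iou(pred=[0, 0, 1, 1], truth=[0, 1, 1, 1], num_classes=2)
```

Benchmark scores are normally accumulated over all pixels of a dataset split rather than averaged per image; with this sketch that corresponds to concatenating the label lists of every image before calling the function.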
