LinkNet: Exploiting Encoder Representations for Efficient Semantic Segmentation

About

Pixel-wise semantic segmentation for visual scene understanding not only needs to be accurate, but also efficient in order to find any use in real-time application. Existing algorithms even though are accurate but they do not focus on utilizing the parameters of neural network efficiently. As a result they are huge in terms of parameters and number of operations; hence slow too. In this paper, we propose a novel deep neural network architecture which allows it to learn without any significant increase in number of parameters. Our network uses only 11.5 million parameters and 21.2 GFLOPs for processing an image of resolution 3x640x360. It gives state-of-the-art performance on CamVid and comparable results on Cityscapes dataset. We also compare our networks processing time on NVIDIA GPU and embedded system device with existing state-of-the-art architectures for different image resolutions.

Abhishek Chaurasia, Eugenio Culurciello• 2017

Related benchmarks

Task	Dataset	Result
Semantic segmentation	Cityscapes (test)	--	1252
Semantic segmentation	Cityscapes (val)	--	572
Semantic segmentation	LoveDA	mIoU48.5	97
Semantic segmentation	LoveDA (test)	mIoU48.5	92
Semantic segmentation	Mapillary Vistas (val)	mIoU47.7	84
Road Extraction	Massachusetts	mIoU61.59	67
Semantic segmentation	Cityscapes	Throughput (FPS)65.8	42
Road Segmentation	DeepGlobe	IoU64.42	41
Road Segmentation	Massachusetts Road Dataset	IoU (Average)0.6312	35
Road Segmentation	CHN6-CUG	mIoU33.93	34

Showing 10 of 25 rows

Other info

Follow for update

@wizwand_team Discord