TernausNet: U-Net with VGG11 Encoder Pre-Trained on ImageNet for Image Segmentation

About

Pixel-wise image segmentation is a demanding task in computer vision. Classical U-Net architectures, composed of encoders and decoders, are very popular for segmentation of medical images, satellite images, etc. Typically, a neural network initialized with weights from a network pre-trained on a large dataset such as ImageNet performs better than one trained from scratch on a small dataset. In some practical applications, particularly in medicine and traffic safety, the accuracy of the models is of utmost importance. In this paper, we demonstrate how a U-Net type architecture can be improved by the use of a pre-trained encoder. Our code and corresponding pre-trained weights are publicly available at https://github.com/ternaus/TernausNet. We compare three weight initialization schemes: LeCun uniform, the encoder with weights from VGG11, and the full network trained on the Carvana dataset. This network architecture was part of the winning solution (1st out of 735) in the Kaggle Carvana Image Masking Challenge.
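LeCun uniform is one of the three initialization schemes the abstract compares. The following is a minimal sketch of it, assuming the common definition in which each weight is drawn uniformly from [-limit, limit] with limit = sqrt(3 / fan_in). The function name `lecun_uniform` and the layer shape are illustrative assumptions and are not taken from the TernausNet repository.

```python
import math
import random


def lecun_uniform(fan_in, fan_out, seed=0):
    """Return a fan_out x fan_in weight matrix drawn from U(-limit, limit).

    With limit = sqrt(3 / fan_in), each weight has variance 1 / fan_in.
    """
    rng = random.Random(seed)
    limit = math.sqrt(3.0 / fan_in)
    return [[rng.uniform(-limit, limit) for _ in range(fan_in)]
            for _ in range(fan_out)]


# Hypothetical layer: a 3x3 convolution over 64 input channels with 128 filters,
# so fan_in = 3 * 3 * 64 (kernel height * kernel width * input channels).
fan_in = 3 * 3 * 64
weights = lecun_uniform(fan_in=fan_in, fan_out=128)

# Empirical variance of the sampled weights, to compare against 1 / fan_in.
flat = [w for row in weights for w in row]
mean = sum(flat) / len(flat)
variance = sum((w - mean) ** 2 for w in flat) / len(flat)
```

Random initialization of this kind uses no information from any dataset, which is the contrast with the two pre-trained options in the comparison.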

Vladimir Iglovikov, Alexey Shvets • 2018

Related benchmarks

Task	Dataset	Result
Surgical Instrument Segmentation	EndoVis 2018 (test)	Ch_IoU 46.22	32
Surgical Instrument Segmentation	EndoVis 2017 (test)	mIoU 33.78	22
Surgical Tool Segmentation	CaDIS (test)	IoU (m) 46.47	7
Tool Segmentation	Sankara-MSICS (test)	mIoU 42.76	6
Surgical Instrument Segmentation	Endovis to Surgery Case 1	Mean Dice (Domain A) 94.2	5
Surgical Instrument Segmentation	UCL to Surgery Case 2 (Domain A: UCL, Domain B: Surgery)	Dice (Domain A) 95.8	5
Surgical Instrument Segmentation	Endovis to UCL Case 3	Mean Dice (Domain A) 93.3	5
Surgical Instrument Segmentation	UCL to Endovis Case 4 (Domain A: UCL, Domain B: Endovis)	Mean Dice (Domain A) 93.4	5
Instrument Part Segmentation	EndoVis 2018 (test)	mDice (%) 61.78	5
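
The metrics in these rows are variants of IoU and Dice. As a rough illustration of the two underlying quantities, here is a minimal sketch for a single pair of binary masks. The function name `iou_and_dice`, the example masks, and the empty-mask convention are assumptions made for illustration. The benchmark variants (mIoU, Ch_IoU, Mean Dice) additionally average over classes or images under dataset-specific rules that are not reproduced here, and the listed values appear to be on a 0 to 100 scale rather than 0 to 1.

```python
def iou_and_dice(pred, target):
    """Compute IoU and Dice for two equal-sized binary masks (nested lists of 0/1)."""
    p = [v for row in pred for v in row]
    t = [v for row in target for v in row]
    intersection = sum(1 for a, b in zip(p, t) if a and b)
    pred_area = sum(1 for a in p if a)
    target_area = sum(1 for b in t if b)
    union = pred_area + target_area - intersection
    # Assumed convention: two empty masks count as a perfect match.
    iou = intersection / union if union else 1.0
    total = pred_area + target_area
    dice = 2 * intersection / total if total else 1.0
    return iou, dice


# Toy 2x3 masks: 3 predicted pixels, 3 ground-truth pixels, 2 in common.
pred = [[1, 1, 0],
        [0, 1, 0]]
target = [[1, 0, 0],
          [0, 1, 1]]
iou, dice = iou_and_dice(pred, target)
# For this pair: intersection = 2 and union = 4, so IoU = 0.5 and Dice = 4/6.
```

For a single mask pair the two are related by Dice = 2 * IoU / (1 + IoU), so Dice is never lower than IoU.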

