
ESPNetv2: A Light-weight, Power Efficient, and General Purpose Convolutional Neural Network

About

We introduce a light-weight, power efficient, and general purpose convolutional neural network, ESPNetv2, for modeling visual and sequential data. Our network uses group point-wise and depth-wise dilated separable convolutions to learn representations from a large effective receptive field with fewer FLOPs and parameters. The performance of our network is evaluated on four different tasks: (1) object classification, (2) semantic segmentation, (3) object detection, and (4) language modeling. Experiments on these tasks, including image classification on ImageNet and language modeling on the Penn Treebank dataset, demonstrate the superior performance of our method over the state-of-the-art methods. Our network outperforms ESPNet by 4-5% and has 2-4x fewer FLOPs on the PASCAL VOC and Cityscapes datasets. Compared to YOLOv2 on MS-COCO object detection, ESPNetv2 delivers 4.4% higher accuracy with 6x fewer FLOPs. Our experiments show that ESPNetv2 is much more power efficient than existing state-of-the-art efficient methods including ShuffleNets and MobileNets. Our code is open-source and available at https://github.com/sacmehta/ESPNetv2
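To make the "fewer parameters, larger receptive field" claim concrete, here is a minimal counting sketch. It is not the ESPNetv2 architecture or code from the authors' repository; it only applies the standard parameter-count formulas for a dense convolution, a depth-wise separable convolution, and a grouped point-wise convolution, plus the usual formula for the extent of a dilated kernel. The function names and the channel sizes (64 in, 64 out, 4 groups, dilation 4) are hypothetical values chosen for illustration, and biases are ignored.

```python
def effective_kernel_size(k: int, dilation: int) -> int:
    """Side length of the input region spanned by a k x k kernel with the given dilation."""
    return (k - 1) * dilation + 1


def standard_conv_params(c_in: int, c_out: int, k: int) -> int:
    """Weights in a dense k x k convolution (bias ignored)."""
    return k * k * c_in * c_out


def depthwise_dilated_separable_params(c_in: int, c_out: int, k: int, groups: int = 1) -> int:
    """Weights in a depth-wise k x k convolution followed by a (grouped) 1 x 1 convolution.

    Dilation spreads the kernel taps apart but adds no weights,
    so it does not appear in this count.
    """
    if c_in % groups or c_out % groups:
        raise ValueError("channel counts must be divisible by groups")
    depthwise = k * k * c_in               # one k x k filter per input channel
    pointwise = (c_in // groups) * c_out   # each output channel sees c_in / groups inputs
    return depthwise + pointwise


if __name__ == "__main__":
    # Hypothetical layer sizes, chosen only for illustration.
    c_in, c_out, k = 64, 64, 3
    print("standard 3x3 conv:       ", standard_conv_params(c_in, c_out, k))
    print("separable, point-wise:   ", depthwise_dilated_separable_params(c_in, c_out, k))
    print("separable, 4-group 1x1:  ", depthwise_dilated_separable_params(c_in, c_out, k, groups=4))
    print("extent of 3x3, dilation 4:", effective_kernel_size(k, 4))
```

Under these assumed sizes, the grouped point-wise variant needs far fewer weights than the dense 3x3 layer, while dilation widens the area each kernel covers without adding any weights.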

Sachin Mehta, Mohammad Rastegari, Linda Shapiro, Hannaneh Hajishirzi • 2018

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Object Detection | COCO 2017 (val) | AP 26 | 2454 |
| Semantic segmentation | Cityscapes (test) | mIoU 66.2 | 1145 |
| Object Detection | PASCAL VOC 2007 (test) | mAP 75 | 821 |
| Semantic segmentation | Cityscapes (val) | mIoU 66.4 | 572 |
| Semantic segmentation | Cityscapes (val) | mIoU 66.4 | 287 |
| Object Detection | MS-COCO 2017 (val) | -- | 237 |
| Semantic segmentation | Cityscapes (val) | mIoU 66.4 | 108 |
| Semantic segmentation | Trans10K v2 (test) | mIoU 12.27 | 104 |
| Language Modeling | Penn Treebank word-level (test) | Perplexity 63.47 | 72 |
| Semantic segmentation | Cityscapes fine (test) | mIoU 66.2 | 44 |
Showing 10 of 19 rows
