SegNeXt: Rethinking Convolutional Attention Design for Semantic Segmentation

About

We present SegNeXt, a simple convolutional network architecture for semantic segmentation. Recent transformer-based models have dominated the field of semantic segmentation due to the efficiency of self-attention in encoding spatial information. In this paper, we show that convolutional attention is a more efficient and effective way to encode contextual information than the self-attention mechanism in transformers. By re-examining the characteristics of successful segmentation models, we identify several key components that drive their performance improvements. This motivates us to design a novel convolutional attention network that uses cheap convolutional operations. Without bells and whistles, SegNeXt significantly improves on previous state-of-the-art methods on popular benchmarks, including ADE20K, Cityscapes, COCO-Stuff, Pascal VOC, Pascal Context, and iSAID. Notably, SegNeXt outperforms EfficientNet-L2 w/ NAS-FPN and achieves 90.6% mIoU on the Pascal VOC 2012 test leaderboard using only 1/10 of its parameters. On average, SegNeXt achieves about 2.0% mIoU improvement over state-of-the-art methods on the ADE20K dataset with the same or fewer computations. Code is available at https://github.com/uyzhang/JSeg (Jittor) and https://github.com/Visual-Attention-Network/SegNeXt (PyTorch).

Meng-Hao Guo, Cheng-Ze Lu, Qibin Hou, Zhengning Liu, Ming-Ming Cheng, Shi-Min Hu • 2022

Related benchmarks

Task	Dataset	Result
Semantic segmentation	ADE20K (val)	mIoU 52.1	3089
Semantic segmentation	PASCAL VOC 2012 (test)	mIoU 90.6	1485
Semantic segmentation	Cityscapes (test)	mIoU 78	1254
Image Classification	ImageNet-1K	Top-1 Acc 83.9	1239
Semantic segmentation	ADE20K	mIoU 48.5	1028
Image Classification	ImageNet-1k (val)	Top-1 Accuracy 83.9	960
Semantic segmentation	ADE20K	mIoU 48.5	699
Semantic segmentation	Cityscapes	mIoU 83.2	674
Semantic segmentation	Cityscapes (val)	mIoU 83.2	572
Semantic segmentation	Cityscapes (val)	mIoU 82.6	552
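
The segmentation results above are reported as mIoU (mean intersection over union), shown as a percentage. As a reading aid, here is a minimal sketch of how that metric is commonly computed from per-pixel class labels. It is not taken from the SegNeXt code base: the function name mean_iou, the ignore label 255, and the choice to skip classes that appear in neither prediction nor ground truth are assumptions made for illustration, and benchmark toolkits usually accumulate the counts over a whole dataset rather than a single image.

```python
from typing import Optional, Sequence


def mean_iou(
    pred: Sequence[int],
    target: Sequence[int],
    num_classes: int,
    ignore_index: Optional[int] = 255,  # assumed "void" label; datasets differ
) -> float:
    """Mean intersection-over-union over flattened per-pixel class labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")

    intersection = [0] * num_classes  # pixels where pred == target == c
    pred_area = [0] * num_classes     # pixels predicted as class c
    target_area = [0] * num_classes   # pixels labelled as class c

    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # unlabelled pixels do not count for or against any class
        pred_area[p] += 1
        target_area[t] += 1
        if p == t:
            intersection[p] += 1

    ious = []
    for c in range(num_classes):
        union = pred_area[c] + target_area[c] - intersection[c]
        if union > 0:  # assumption: skip classes absent from both pred and target
            ious.append(intersection[c] / union)

    if not ious:
        raise ValueError("no valid pixels to evaluate")
    return sum(ious) / len(ious)


# Toy example: six pixels, three classes, last pixel unlabelled.
pred = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 255]
score = mean_iou(pred, target, num_classes=3)
print(f"mIoU = {100 * score:.1f}%")
```

In the toy example the per-class IoUs are 1/2, 2/3, and 1, so the mean is 13/18. Multiplying by 100 gives the percentage form used in the table.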

