Self-supervised Augmentation Consistency for Adapting Semantic Segmentation

About

We propose an approach to domain adaptation for semantic segmentation that is both practical and highly accurate. In contrast to previous work, we abandon the use of computationally involved adversarial objectives, network ensembles and style transfer. Instead, we employ standard data augmentation techniques (photometric noise, flipping and scaling) and ensure consistency of the semantic predictions across these image transformations. We develop this principle in a lightweight self-supervised framework trained on co-evolving pseudo labels without the need for cumbersome extra training rounds. Simple in training from a practitioner's standpoint, our approach is remarkably effective. We achieve significant improvements of the state-of-the-art segmentation accuracy after adaptation, consistent both across different choices of the backbone architecture and adaptation scenarios.
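The consistency idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it covers only horizontal flipping, and the names `flip_consistency_loss`, `toy_logits`, `biased_logits`, as well as the confidence threshold of 0.9, are assumptions made for illustration. Pseudo labels are taken from the prediction on the original image, and the prediction on the flipped image (flipped back for alignment) is penalised with cross-entropy wherever the pseudo label is confident.

```python
import numpy as np


def softmax(logits, axis=0):
    """Numerically stable softmax over the class axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def flip_consistency_loss(predict_logits, image, threshold=0.9):
    """Cross-entropy between pseudo labels from the original view and the
    prediction on a horizontally flipped view.

    predict_logits: callable mapping an (H, W) image to (C, H, W) logits.
    threshold: hypothetical confidence cut-off for keeping a pseudo label.
    """
    # Pseudo labels come from the unaugmented image.
    probs = softmax(predict_logits(image))
    pseudo = probs.argmax(axis=0)                # (H, W) class indices
    confident = probs.max(axis=0) >= threshold   # (H, W) boolean mask

    # Predict on the flipped image, then undo the flip so pixels line up.
    flipped_logits = predict_logits(image[:, ::-1])
    aligned = softmax(flipped_logits[:, :, ::-1])

    # Probability that the augmented view assigns to each pseudo label.
    rows, cols = np.indices(pseudo.shape)
    picked = aligned[pseudo, rows, cols]
    cross_entropy = -np.log(picked + 1e-12)

    if not confident.any():
        return 0.0
    return float(cross_entropy[confident].mean())


def toy_logits(image):
    """Hypothetical two-class 'model' that looks only at pixel intensity."""
    return np.stack([1.0 - image, image]) * 10.0


def biased_logits(image):
    """Same toy model with a left-to-right bias, so it is not flip-consistent."""
    logits = toy_logits(image)
    logits[1] += 10.0 * np.linspace(-1.0, 1.0, image.shape[1])
    return logits


image = np.tile(np.array([0.1, 0.1, 0.9, 0.9]), (4, 1))
loss_consistent = flip_consistency_loss(toy_logits, image)
loss_biased = flip_consistency_loss(biased_logits, image)
```

In this toy setup the intensity-only model yields a loss near zero, while the position-biased model is penalised because its predictions change under flipping. A training loop would minimise such a loss on unlabeled target-domain images; the abstract also mentions photometric noise and scaling, which are omitted here.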

Nikita Araslanov, Stefan Roth • 2021

Related benchmarks

Task	Dataset	Metric	Result	
Semantic segmentation	Cityscapes (test)	mIoU	55.7	1252
Semantic segmentation	GTA5 → Cityscapes (val)	mIoU	53.8	586
Semantic segmentation	Cityscapes (val)	mIoU	53.8	572
Semantic segmentation	SYNTHIA to Cityscapes (val)	Rider IoU	25.4	480
Semantic segmentation	Cityscapes (val)	mIoU	52.6	301
Semantic segmentation	SYNTHIA to Cityscapes	Road IoU	89.3	159
Semantic segmentation	GTA5 to Cityscapes (test)	mIoU	55.7	151
Semantic segmentation	Synthia to Cityscapes (test)	Road IoU	89.3	138
Semantic segmentation	GTA5 to Cityscapes 1.0 (val)	Road IoU	90.4	98
Semantic segmentation	GTA to Cityscapes	Road IoU	90.4	72

Showing 10 of 20 rows
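
The benchmarks above report per-class IoU and mean IoU (mIoU), given in percent. As a reminder of how the metric is defined, here is a small sketch; the function name `mean_iou` and the toy arrays are illustrative assumptions. Benchmark protocols typically accumulate intersections and unions over the whole evaluation set rather than averaging per image, which this single-image version does not show.

```python
import numpy as np


def mean_iou(pred, target, num_classes):
    """Mean intersection-over-union across classes for one label map.

    Classes absent from both prediction and ground truth are skipped.
    """
    ious = []
    for c in range(num_classes):
        pred_c = pred == c
        target_c = target == c
        union = np.logical_or(pred_c, target_c).sum()
        if union == 0:
            continue  # class appears in neither map
        intersection = np.logical_and(pred_c, target_c).sum()
        ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")


pred = np.array([[0, 0], [1, 1]])
target = np.array([[0, 1], [1, 1]])
miou = mean_iou(pred, target, num_classes=2)
# Class 0: IoU = 1/2, class 1: IoU = 2/3, so the mean is 7/12.
```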
