Fast and Accurate Scene Parsing via Bi-direction Alignment Networks

About

In this paper, we propose an effective method for fast and accurate scene parsing called Bidirectional Alignment Network (BiAlignNet). Previously, one representative work, BiSeNet, uses two different paths (Context Path and Spatial Path) to achieve balanced learning of semantics and details, respectively. However, the relationship between the two paths is not well explored. We argue that both paths can benefit each other in a complementary way. Motivated by this, we propose a novel network that aligns the information of the two paths into each other through a learned flow field. To avoid noise and semantic gaps, we introduce a Gated Flow Alignment Module to align both features in a bidirectional way. Moreover, to make the Spatial Path learn more detailed information, we present an edge-guided hard pixel mining loss to supervise the aligned learning process. Our method achieves 80.1% and 78.5% mIoU on the Cityscapes validation and test sets, respectively, while running at 30 FPS with full-resolution inputs. Code and models will be available at https://github.com/jojacola/BiAlignNet.
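The idea of aligning one feature map to another through a flow field, then gating the result, can be illustrated with a small pure-Python sketch. This is not the authors' Gated Flow Alignment Module: in the paper the flow field is learned, whereas here `flow` and `gate` are passed in by hand, and the convex blend in `gated_align` is an assumed fusion rule chosen only for illustration. All function names are hypothetical.

```python
import math

def bilinear_sample(feat, x, y):
    """Sample a 2-D map (list of rows) at fractional (x, y), clamped to the border."""
    h, w = len(feat), len(feat[0])
    x = min(max(x, 0.0), w - 1.0)
    y = min(max(y, 0.0), h - 1.0)
    x0, y0 = int(math.floor(x)), int(math.floor(y))
    x1, y1 = min(x0 + 1, w - 1), min(y0 + 1, h - 1)
    ax, ay = x - x0, y - y0
    top = (1 - ax) * feat[y0][x0] + ax * feat[y0][x1]
    bottom = (1 - ax) * feat[y1][x0] + ax * feat[y1][x1]
    return (1 - ay) * top + ay * bottom

def warp(feat, flow):
    """Output pixel (x, y) reads feat at (x + dx, y + dy), where flow[y][x] = (dx, dy)."""
    h, w = len(feat), len(feat[0])
    return [[bilinear_sample(feat, x + flow[y][x][0], y + flow[y][x][1])
             for x in range(w)] for y in range(h)]

def gated_align(source, target, flow, gate):
    """Assumed fusion: per-pixel convex blend of the warped source and the target."""
    warped = warp(source, flow)
    h, w = len(target), len(target[0])
    return [[gate[y][x] * warped[y][x] + (1 - gate[y][x]) * target[y][x]
             for x in range(w)] for y in range(h)]

if __name__ == "__main__":
    source = [[0.0, 1.0], [2.0, 3.0]]
    target = [[0.0, 0.0], [0.0, 0.0]]
    flow = [[(1.0, 0.0), (1.0, 0.0)], [(1.0, 0.0), (1.0, 0.0)]]  # shift sampling one pixel right
    gate = [[0.5, 0.5], [0.5, 0.5]]
    print(gated_align(source, target, flow, gate))
```

A gate value near 0 keeps the target feature untouched, while a value near 1 takes the warped feature, which is one simple way to suppress noisy flow estimates. A bidirectional variant would apply the same operation twice with the roles of the two paths swapped.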

Yanran Wu, Xiangtai Li, Chen Shi, Yunhai Tong, Yang Hua, Tao Song, Ruhui Ma, Haibing Guan • 2021

Related benchmarks

Task                   Dataset                  Metric  Result  Rank
Semantic segmentation  Cityscapes (test)        mIoU    78.5    1145
Semantic segmentation  CamVid (test)            mIoU    77.1    411
Semantic segmentation  Cityscapes (val)         mIoU    80.1    287
Semantic segmentation  BDD (val)                mIoU    53.4    47
Semantic segmentation  UDS merged (test)        mIoU    73.9    18
Semantic segmentation  IDD original size (val)  mIoU    73.9    17
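For readers unfamiliar with the metric in the table, mIoU (mean intersection over union) averages the per-class ratio TP / (TP + FP + FN). The following is a minimal sketch of that computation over flat lists of labels. The function name `mean_iou` and the `ignore_index=255` default are assumptions made for illustration, and real benchmark scripts typically accumulate the confusion matrix over an entire dataset before averaging.

```python
def mean_iou(pred, target, num_classes, ignore_index=255):
    """Mean IoU over classes that appear in the prediction or the ground truth."""
    # conf[t][p] counts pixels with ground-truth class t predicted as class p.
    conf = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:  # assumed convention for unlabeled pixels
            continue
        conf[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = conf[c][c]
        fn = sum(conf[c]) - tp
        fp = sum(conf[r][c] for r in range(num_classes)) - tp
        denom = tp + fp + fn
        if denom > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0

if __name__ == "__main__":
    pred = [0, 0, 1, 1]
    target = [0, 1, 1, 1]
    # Class 0: IoU = 1/2; class 1: IoU = 2/3; mean = 7/12.
    print(mean_iou(pred, target, num_classes=2))
```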
