
Mask R-CNN with Pyramid Attention Network for Scene Text Detection

About

In this paper, we present a new Mask R-CNN-based text detection approach that robustly detects multi-oriented and curved text in natural scene images in a unified manner. To enhance the feature representation ability of Mask R-CNN for text detection, we propose using the Pyramid Attention Network (PAN) as a new backbone network for Mask R-CNN. Experiments demonstrate that PAN suppresses false alarms caused by text-like backgrounds more effectively. The proposed approach achieves superior performance on both multi-oriented (ICDAR-2015, ICDAR-2017 MLT) and curved (SCUT-CTW1500) text detection benchmarks using only single-scale, single-model testing.

Zhida Huang, Zhuoyao Zhong, Lei Sun, Qiang Huo • 2018

Related benchmarks

| Task                 | Dataset                  | Result         | Rank |
|----------------------|--------------------------|----------------|------|
| Text Detection       | ICDAR 2015               | Precision 90.8 | 171  |
| Text Detection       | ICDAR 2015 (test)        | F1 Score 85.9  | 108  |
| Text Detection       | ICDAR MLT 2017 (test)    | Precision 80   | 101  |
| Text Detection       | SCUT-CTW1500             | Precision 86.8 | 39   |
| Text Detection       | CTW1500 Whole set (test) | Recall 83.2    | 24   |
| Scene Text Detection | SCUT-CTW1500 (test)      | F-measure 85   | 14   |
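
The benchmark rows mix precision, recall, and F-measure. By the standard definition, the F-measure (F1 score) is the harmonic mean of precision and recall. The sketch below checks that relationship on the CTW1500 figures. It assumes that the SCUT-CTW1500 precision (86.8) and the CTW1500 recall (83.2) come from the same evaluation run, which the listing does not state. `f_measure` is a hypothetical helper name, not part of any benchmark toolkit.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale, e.g. percent)."""
    if precision + recall == 0:
        return 0.0  # avoid division by zero when nothing is detected
    return 2 * precision * recall / (precision + recall)


# Figures taken from the CTW1500 rows of the listing; pairing them is an assumption.
ctw_precision, ctw_recall = 86.8, 83.2
ctw_f = f_measure(ctw_precision, ctw_recall)
print(round(ctw_f, 1))  # 85.0
```

Under that assumption, the computed value agrees with the listed F-measure of 85 after rounding.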
