
Single Shot Text Detector with Regional Attention

About

We present a novel single-shot text detector that directly outputs word-level bounding boxes in a natural image. We propose an attention mechanism which roughly identifies text regions via an automatically learned attentional map. This substantially suppresses background interference in the convolutional features, which is the key to producing accurate inference of words, particularly at extremely small sizes. This results in a single model that essentially works in a coarse-to-fine manner. It departs from recent FCN-based text detectors which cascade multiple FCN models to achieve an accurate prediction. Furthermore, we develop a hierarchical inception module which efficiently aggregates multi-scale inception features. This enhances local details, and also encodes strong context information, allowing the detector to work reliably on multi-scale and multi-orientation text with single-scale images. Our text detector achieves an F-measure of 77% on the ICDAR 2015 benchmark, advancing the state-of-the-art results in [18, 28]. Demo is available at: http://sstd.whuang.org/.
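The background-suppression idea in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the two-channel (background/text) score layout, and the toy tensor shapes are all assumptions made for illustration. It only shows how a per-pixel text probability, used as an attention map, scales down convolutional features at locations that look like background.

```python
import numpy as np


def softmax(x, axis=0):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    e = np.exp(shifted)
    return e / e.sum(axis=axis, keepdims=True)


def apply_text_attention(features, logits):
    """Weight features by a per-pixel text probability (hypothetical helper).

    features: array of shape (C, H, W), convolutional features.
    logits:   array of shape (2, H, W); channel 0 = background score,
              channel 1 = text score (an assumed layout).
    Returns the attended features and the (H, W) attention map.
    """
    probs = softmax(logits, axis=0)
    attention = probs[1]  # probability that each pixel is text
    attended = features * attention[None, :, :]  # broadcast over channels
    return attended, attention


# Toy example: background scores dominate everywhere except the centre pixel.
rng = np.random.default_rng(0)
features = rng.normal(size=(4, 3, 3))
logits = np.zeros((2, 3, 3))
logits[0] = 5.0                  # background-dominant by default
logits[:, 1, 1] = [-5.0, 5.0]    # text-dominant at the centre
attended, attention = apply_text_attention(features, logits)
```

In this toy setup the centre pixel keeps almost all of its feature magnitude while every other location is scaled to under 1% of its original value, which is the suppression effect the abstract describes in general terms.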

Pan He, Weilin Huang, Tong He, Qile Zhu, Yu Qiao, Xiaolin Li • 2017

Related benchmarks

Task                     Dataset                            Metric     Result  Rank
Text Detection           ICDAR 2015                         Precision  80.23   171
Scene Text Detection     ICDAR 2015 (test)                  F1 Score   77      150
Oriented Text Detection  ICDAR 2015 (test)                  Precision  80.2    129
Text Detection           ICDAR 2015 (test)                  F1 Score   76.91   108
Text Detection           ICDAR 2013 (test)                  F1 Score   88      88
Text Detection           ICDAR Incidental Text 2015 (test)  Precision  80      52
Text Detection           COCO-text (test)                   Recall     31      19
Scene Text Detection     COCO-Text V1.1 (test)              Precision  46      9

