Detecting Text in Natural Image with Connectionist Text Proposal Network
About
We propose a novel Connectionist Text Proposal Network (CTPN) that accurately localizes text lines in natural images. The CTPN detects a text line in a sequence of fine-scale text proposals directly in convolutional feature maps. We develop a vertical anchor mechanism that jointly predicts the location and text/non-text score of each fixed-width proposal, considerably improving localization accuracy. The sequential proposals are naturally connected by a recurrent neural network, which is seamlessly incorporated into the convolutional network, resulting in an end-to-end trainable model. This allows the CTPN to explore rich context information of the image, making it powerful at detecting extremely ambiguous text. The CTPN works reliably on multi-scale and multi-language text without further post-processing, departing from previous bottom-up methods that require multi-step post-processing. It achieves 0.88 and 0.61 F-measure on the ICDAR 2013 and 2015 benchmarks, surpassing recent results [8, 35] by a large margin. The CTPN is computationally efficient, running at 0.14 s/image with the very deep VGG16 model [27]. An online demo is available at: http://textdet.com/.
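The vertical anchor mechanism described above can be illustrated with a short sketch. The abstract only states that proposals have a fixed width and that their vertical location is predicted; the concrete values used here (a 16-pixel width equal to the feature-map stride, and 10 heights growing geometrically from 11 pixels) and the function names `anchor_heights` and `vertical_anchors` are assumptions chosen for illustration, not details taken from this page.

```python
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in image pixels


def anchor_heights(k: int = 10, min_height: float = 11.0, ratio: float = 0.7) -> List[float]:
    """Return k anchor heights growing geometrically from min_height.

    k, min_height and ratio are illustrative assumptions.
    """
    return [min_height / (ratio ** i) for i in range(k)]


def vertical_anchors(feat_h: int, feat_w: int, stride: int = 16,
                     heights: Optional[List[float]] = None) -> List[Box]:
    """Place every anchor height at every feature-map cell.

    All anchors share one fixed width (assumed equal to the stride);
    only the height varies, so only the vertical extent is ambiguous.
    """
    if heights is None:
        heights = anchor_heights()
    boxes: List[Box] = []
    for row in range(feat_h):
        # Vertical centre of this feature-map cell in image coordinates.
        cy = row * stride + stride / 2.0
        for col in range(feat_w):
            x_min = float(col * stride)
            x_max = x_min + stride
            for h in heights:
                boxes.append((x_min, cy - h / 2.0, x_max, cy + h / 2.0))
    return boxes


# A tiny 2 x 3 feature map yields 2 * 3 * 10 anchors.
anchors = vertical_anchors(feat_h=2, feat_w=3)
print(len(anchors))
```

Because the horizontal extent of each box is fixed, a regressor in such a scheme only has to predict a vertical centre and a height per anchor, and a text line is then assembled by linking neighbouring proposals.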
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Text Detection | ICDAR 2015 | Precision | 74.2 | 171 |
| Text Detection | CTW1500 (test) | Precision | 60.4 | 157 |
| Scene Text Detection | ICDAR 2015 (test) | F1 Score | 61 | 150 |
| Oriented Text Detection | ICDAR 2015 (test) | Precision | 74.2 | 129 |
| Text Detection | ICDAR 2015 (test) | F1 Score | 60.85 | 108 |
| Text Detection | ICDAR 2013 (test) | F1 Score | 88 | 88 |
| Text Detection | CTW1500 | F-measure | 56.9 | 70 |
| Text Detection | ICDAR Incidental Text 2015 (test) | Precision | 74 | 52 |
| Text Localization | ICDAR 2013 (test) | Recall | 84 | 28 |
| Scene Text Detection | ReCTS 2019 (test) | Recall | 96.17 | 24 |
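
The table mixes Precision, Recall, and F1 Score (F-measure) entries. As a reminder of how these relate, the F-measure is the harmonic mean of precision and recall. The sketch below assumes that standard definition; the function name `f_measure` and the input values are hypothetical and are not results from the table.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (both on the same scale)."""
    if precision + recall == 0:
        return 0.0  # Avoid division by zero when nothing is detected correctly.
    return 2.0 * precision * recall / (precision + recall)


# Hypothetical inputs: a detector with high precision but low recall
# is pulled toward the lower of the two values.
score = f_measure(1.0, 0.5)
print(round(score, 4))
```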