TextBoxes: A Fast Text Detector with a Single Deep Neural Network
About
This paper presents TextBoxes, an end-to-end trainable, fast scene text detector that detects scene text with both high accuracy and efficiency in a single network forward pass, with no post-processing other than standard non-maximum suppression. TextBoxes outperforms competing methods in text localization accuracy and is much faster, taking only 0.09s per image in a fast implementation. Furthermore, when combined with a text recognizer, TextBoxes significantly outperforms state-of-the-art approaches on word spotting and end-to-end text recognition tasks.
Minghui Liao, Baoguang Shi, Xiang Bai, Xinggang Wang, Wenyu Liu • 2016
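The abstract names standard non-maximum suppression (NMS) as the only post-processing step. As a rough illustration of what that step usually involves, here is a minimal sketch of greedy, IoU-based NMS in plain Python. It is not the authors' implementation: the function names, the `(x_min, y_min, x_max, y_max)` box format, and the 0.5 IoU threshold are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: visit boxes by descending score and keep a box only if
    it does not overlap an already-kept box by more than iou_threshold.
    The 0.5 default is an assumption, not a value taken from the paper."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping detections of one word, plus a separate word.
boxes = [(10, 10, 110, 40), (12, 12, 112, 42), (200, 50, 300, 80)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)  # indices of the surviving boxes
```

In this toy input the second box overlaps the first with an IoU of about 0.84, so it is suppressed, while the third box does not overlap anything and is kept.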
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Text Detection | Total-Text | Recall | 45.5 | 139 |
| Text Detection | Total-Text (test) | F-Measure | 52.5 | 126 |
| Scene Text Detection | TotalText (test) | Recall | 45.5 | 106 |
| Scene Text Spotting | Total-Text (test) | F-measure (None) | 36.3 | 105 |
| Text Detection | ICDAR 2013 (test) | F1 Score | 86 | 88 |
| Text Localization | ICDAR 2013 (test) | Recall | 83 | 28 |
| End-to-End Text Spotting | ICDAR 2013 (test) | Score S | 93.9 | 25 |
| End-to-end Recognition | Total-Text | F1 Score | 36.3 | 22 |
| End-to-End Text Recognition | Total-Text (test) | F-measure (None) | 36.3 | 17 |
| Word Spotting | ICDAR 2013 | Generic Score | 87 | 12 |
Showing 10 of 15 rows