Towards Unconstrained End-to-End Text Spotting

About

We propose an end-to-end trainable network that can simultaneously detect and recognize text of arbitrary shape, making substantial progress on the open problem of reading scene text of irregular shape. We formulate arbitrary shape text detection as an instance segmentation problem; an attention model is then used to decode the textual content of each irregularly shaped text region without rectification. To extract useful irregularly shaped text instance features from image scale features, we propose a simple yet effective RoI masking step. Additionally, we show that predictions from an existing multi-step OCR engine can be leveraged as partially labeled training data, which leads to significant improvements in both the detection and recognition accuracy of our model. Our method surpasses the state-of-the-art for end-to-end recognition tasks on the ICDAR15 (straight) benchmark by 4.6%, and on the Total-Text (curved) benchmark by more than 16%.
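The RoI masking step mentioned above can be illustrated with a minimal NumPy sketch. It assumes that masking means cropping the feature map to a text instance's bounding box and multiplying the crop by that instance's segmentation mask, so features from neighboring text and background are zeroed before decoding. The function name `roi_mask_features`, the array layout, and the box convention are illustrative assumptions, not the authors' implementation.

```python
import numpy as np


def roi_mask_features(features, instance_mask, box):
    """Crop a feature map to a box and zero out positions outside the instance.

    Hypothetical helper for illustration only.
    features:      (H, W, C) float array of image-level features
    instance_mask: (H, W) boolean array, True where the text instance lies
    box:           (y0, x0, y1, x1), half-open pixel bounds of the instance
    """
    y0, x0, y1, x1 = box
    crop = features[y0:y1, x0:x1, :]
    mask = instance_mask[y0:y1, x0:x1].astype(features.dtype)
    # Broadcast the 2-D mask over the channel axis.
    return crop * mask[:, :, None]


# Toy example: a 4x6 feature map with 2 channels and one small text instance.
features = np.ones((4, 6, 2), dtype=np.float32)
instance_mask = np.zeros((4, 6), dtype=bool)
instance_mask[1:3, 2:5] = True

masked = roi_mask_features(features, instance_mask, (0, 1, 4, 6))
```

In a full model the masked crop would be passed to the attention decoder; here it only shows that positions outside the instance contribute nothing.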

Siyang Qin, Alessandro Bissacco, Michalis Raptis, Yasuhisa Fujii, Ying Xiao • 2019

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| Text Detection | ICDAR 2015 | Precision | 91.7 | 171 |
| Scene Text Detection | ICDAR 2015 (test) | F1 Score | 89.78 | 150 |
| Text Detection | ICDAR 2015 (test) | F1 Score | 87.5 | 108 |
| Scene Text Detection | TotalText (test) | Recall | 85 | 106 |
| Scene Text Spotting | Total-Text (test) | F-measure (None) | 70.7 | 105 |
| End-to-End Scene Text Spotting | Total-Text | Hmean (None) | 67.8 | 55 |
| Text Spotting | ICDAR 2015 (test) | Accuracy (Strong Lexicon) | 83.4 | 36 |
| End-to-end Recognition | Total-Text | F1 Score | 63.9 | 22 |
| End-to-End Text Recognition | Total-Text (test) | F-measure (None) | 63.9 | 17 |
| End-to-End Scene Text Spotting | IC 2015 (test) | Strong Score | 85.51 | 16 |