
FOTS: Fast Oriented Text Spotting with a Unified Network

About

Incidental scene text spotting is considered one of the most difficult and valuable challenges in the document analysis community. Most existing methods treat text detection and recognition as separate tasks. In this work, we propose a unified, end-to-end trainable Fast Oriented Text Spotting (FOTS) network for simultaneous detection and recognition, sharing computation and visual information between the two complementary tasks. Specifically, RoIRotate is introduced to share convolutional features between detection and recognition. Thanks to this convolution-sharing strategy, FOTS adds little computational overhead compared to the baseline text detection network, and joint training learns more generic features, so our method performs better than two-stage methods. Experiments on the ICDAR 2015, ICDAR 2017 MLT, and ICDAR 2013 datasets demonstrate that the proposed method significantly outperforms state-of-the-art methods. This further allows us to develop the first real-time oriented text spotting system, which surpasses all previous state-of-the-art results by more than 5% on the ICDAR 2015 text spotting task while running at 22.6 fps.
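To make the feature-sharing idea more concrete, here is a minimal sketch of resampling an oriented text region from a shared feature map into an axis-aligned grid that a recognizer could read. It is an illustration only, not the paper's RoIRotate implementation: the function names `bilinear` and `rotated_crop`, the single-channel pure-Python feature map, and the choice of a fixed output height with preserved aspect ratio are all assumptions made for this example.

```python
import math


def bilinear(feat, x, y):
    """Bilinearly sample feat[row][col] at real-valued (x, y); zero outside the map."""
    h, w = len(feat), len(feat[0])
    x0, y0 = math.floor(x), math.floor(y)
    value = 0.0
    for yi, wy in ((y0, 1 - (y - y0)), (y0 + 1, y - y0)):
        for xi, wx in ((x0, 1 - (x - x0)), (x0 + 1, x - x0)):
            if 0 <= xi < w and 0 <= yi < h:
                value += wx * wy * feat[yi][xi]
    return value


def rotated_crop(feat, cx, cy, box_w, box_h, angle, out_h=8):
    """Resample an oriented box into an axis-aligned grid of fixed height.

    Hypothetical helper: (cx, cy) is the box centre in feature-map
    coordinates, angle is the rotation of the box's long axis in radians,
    and the output width is chosen to keep the box's aspect ratio.
    """
    out_w = max(1, round(out_h * box_w / box_h))
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    out = []
    for j in range(out_h):
        # Offset of this output row along the box's short axis.
        v = ((j + 0.5) / out_h - 0.5) * box_h
        row = []
        for i in range(out_w):
            # Offset of this output column along the box's long axis.
            u = ((i + 0.5) / out_w - 0.5) * box_w
            # Rotate the offset by the box angle and shift to the centre.
            x = cx + u * cos_a - v * sin_a
            y = cy + u * sin_a + v * cos_a
            row.append(bilinear(feat, x, y))
        out.append(row)
    return out


# Toy feature map whose value equals the column index.
ramp = [[float(c) for c in range(8)] for _ in range(8)]
crop = rotated_crop(ramp, cx=3.5, cy=3.5, box_w=4, box_h=2, angle=0.0, out_h=2)
```

Because every oriented box is read from the same feature map, the detector and the recognizer can share one convolutional backbone instead of each running its own on the image. In a real network this resampling would be done per channel and in a differentiable framework so gradients from recognition flow back into the shared features.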

Xuebo Liu, Ding Liang, Shi Yan, Dagui Chen, Yu Qiao, Junjie Yan • 2018

Related benchmarks

| Task | Dataset | Result | Rank |
|---|---|---|---|
| Text Detection | ICDAR 2015 | Precision 91.9 | 171 |
| Scene Text Detection | ICDAR 2015 (test) | F1 Score 89.84 | 150 |
| Oriented Text Detection | ICDAR 2015 (test) | Precision 88.8 | 129 |
| Text Detection | ICDAR 2015 (test) | F1 Score 89.84 | 108 |
| Scene Text Detection | TotalText (test) | Recall 38 | 106 |
| Scene Text Spotting | Total-Text (test) | F-measure (None) 32.2 | 105 |
| Text Detection | ICDAR MLT 2017 (test) | Precision 81.86 | 101 |
| Text Detection | ICDAR 2013 (test) | F1 Score 92.5 | 88 |
| End-to-End Text Spotting | ICDAR 2015 | Strong Score 83.6 | 80 |
| End-to-End Text Spotting | ICDAR 2015 (test) | -- | 62 |
Showing 10 of 29 rows
