Synthetic Data for Text Localisation in Natural Images

About

In this paper we introduce a new method for text detection in natural images. The method comprises two contributions: First, a fast and scalable engine to generate synthetic images of text in clutter. This engine overlays synthetic text onto existing background images in a natural way, accounting for the local 3D scene geometry. Second, we use the synthetic images to train a Fully-Convolutional Regression Network (FCRN) which efficiently performs text detection and bounding-box regression at all locations and multiple scales in an image. We discuss the relation of FCRN to the recently introduced YOLO detector, as well as other end-to-end object detection systems based on deep learning. The resulting detection network significantly outperforms current methods for text detection in natural images, achieving an F-measure of 84.2% on the standard ICDAR 2013 benchmark. Furthermore, it can process 15 images per second on a GPU.

Ankush Gupta, Andrea Vedaldi, Andrew Zisserman • 2016
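
The F-measure quoted in the abstract can be illustrated with a minimal sketch, assuming the standard definition as the harmonic mean of precision and recall. The function name `f_measure` and the input values are hypothetical, chosen only for illustration. They are not taken from the paper, and the benchmark's protocol for deciding which detections count as correct is not shown here.

```python
def f_measure(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (the F1 score)."""
    if precision + recall == 0:
        # Avoid division by zero when there are no correct detections.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision and recall, for illustration only.
print(round(f_measure(0.90, 0.80), 3))  # prints 0.847
```

Because the harmonic mean is pulled towards the smaller of its two inputs, a detector cannot reach a high F-measure by maximising recall at the expense of precision, or the reverse.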

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Text Detection | ICDAR 2013 (test) | F1 Score | 83 | 88 |
| Text Localization | ICDAR 2013 (test) | Recall | 76 | 28 |
| End-to-End Text Spotting | ICDAR 2013 (test) | -- | -- | 25 |
| End-to-End Text Spotting | ICDAR 2011 (test) | F-measure | 84.3 | 12 |
| End-to-end Recognition | ICDAR 2013 | Strong F-Measure | 85 | 8 |
| End-to-End Text Spotting | Street View Text (SVT) | Max F-measure | 55.7 | 7 |
| End-to-End Text Spotting | Street View Text SVT-50 Constrained lexicon | Maximum F1-Score | 68 | 7 |
| Word Spotting | ICDAR 2013 (test) | Generic Metric (G) | 84.7 | 6 |
