
Recursive Recurrent Nets with Attention Modeling for OCR in the Wild

About

We present recursive recurrent neural networks with attention modeling (R$^2$AM) for lexicon-free optical character recognition in natural scene images. The primary advantages of the proposed method are: (1) use of recursive convolutional neural networks (CNNs), which allow for parametrically efficient and effective image feature extraction; (2) an implicitly learned character-level language model, embodied in a recurrent neural network which avoids the need to use N-grams; and (3) the use of a soft-attention mechanism, allowing the model to selectively exploit image features in a coordinated way, and allowing for end-to-end training within a standard backpropagation framework. We validate our method with state-of-the-art performance on challenging benchmark datasets: Street View Text, IIIT5k, ICDAR and Synth90k.
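As a rough illustration of the soft-attention step described above, here is a minimal NumPy sketch: a decoder state scores each image-feature vector, a softmax turns the scores into weights, and the context is the weighted sum of features. The function name and the plain dot-product scoring are our own simplifications for exposition, not the authors' exact formulation (which uses learned projections and the recurrent hidden state).

```python
import numpy as np

def soft_attention(features, query):
    """Soft attention over a set of image features.

    features: (T, D) array of feature vectors (e.g. CNN feature columns)
    query:    (D,) decoder state used to score each feature

    Returns the (D,) context vector and the (T,) attention weights.
    """
    scores = features @ query                # (T,) alignment scores
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    context = weights @ features             # (D,) weighted combination
    return context, weights

# Toy usage: 7 feature vectors of dimension 16, one hypothetical decoder state.
rng = np.random.default_rng(0)
feats = rng.standard_normal((7, 16))
q = rng.standard_normal(16)
ctx, w = soft_attention(feats, q)
```

Because the softmax is differentiable, gradients flow through the weights to both the feature extractor and the decoder, which is what allows end-to-end training by standard backpropagation.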

Chen-Yu Lee, Simon Osindero • 2016

Related benchmarks

Task | Dataset | Metric | Result | Rank
Scene Text Recognition | SVT (test) | Word Accuracy | 96.3 | 289
Scene Text Recognition | IIIT5K (test) | Word Accuracy | 96.8 | 244
Scene Text Recognition | IC13 (test) | Word Accuracy | 90 | 207
Scene Text Recognition | IIIT5K | Accuracy | 96.8 | 149
Scene Text Recognition | SVT 647 (test) | Accuracy | 82.4 | 101
Text Recognition | Street View Text (SVT) | Accuracy | 96.3 | 80
Scene Text Recognition | IC03 | Accuracy | 97.9 | 67
Scene Text Recognition | SVT | Accuracy | 80.7 | 67
Scene Text Recognition | IC13 | Accuracy | 90 | 66
Scene Text Recognition | IC03 (test) | Accuracy | 97.9 | 63
(Showing 10 of 38 rows.)
