
End-to-end Handwritten Paragraph Text Recognition Using a Vertical Attention Network

About

Unconstrained handwritten text recognition remains challenging for computer vision systems. Paragraph text recognition is traditionally achieved by two models: the first one for line segmentation and the second one for text line recognition. We propose a unified end-to-end model using hybrid attention to tackle this task. This model is designed to iteratively process a paragraph image line by line. It can be split into three modules. An encoder generates feature maps from the whole paragraph image. Then, an attention module recurrently generates a vertical weighted mask that focuses on the features of the current text line, which amounts to a kind of implicit line segmentation. For each text line's features, a decoder module recognizes the associated character sequence, so that the whole paragraph is recognized line by line. We achieve state-of-the-art character error rates (CER) at paragraph level on three popular datasets: 1.91% for RIMES, 4.45% for IAM and 3.59% for READ 2016. Our code and trained model weights are available at https://github.com/FactoDeepLearning/VerticalAttentionOCR.
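The vertical attention step described in the abstract can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation: the function name `vertical_attention_step` and the `row_scores` input are hypothetical, and the scores are simply passed in here, whereas in the model they would be produced by a learned attention module. The sketch only shows how a softmax over the vertical axis turns an H x W x C feature map into W x C features for one text line.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def vertical_attention_step(features, row_scores):
    """Collapse the vertical axis of a feature map with attention weights.

    features:   nested lists of shape H x W x C (encoder output, assumed layout)
    row_scores: H floats, one score per feature row (hypothetical input;
                a real model would compute these with learned layers)
    Returns (weights, line_features), where weights has length H and
    line_features has shape W x C.
    """
    weights = softmax(row_scores)
    height = len(features)
    width = len(features[0])
    channels = len(features[0][0])
    line_features = [
        [
            sum(weights[h] * features[h][w][c] for h in range(height))
            for c in range(channels)
        ]
        for w in range(width)
    ]
    return weights, line_features

# Toy feature map: 3 rows, 2 columns, 1 channel.
features = [
    [[1.0], [2.0]],
    [[10.0], [20.0]],
    [[100.0], [200.0]],
]
# Scores peaked on the middle row, as if the second text line were current.
weights, line_features = vertical_attention_step(features, [0.0, 50.0, 0.0])
```

With scores this sharply peaked, the weighted sum is dominated by the middle row, which is the sense in which the mask acts as an implicit line segmentation. A decoder would then read `line_features` from left to right to produce the characters of that line.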

Denis Coquenet, Clément Chatelain, Thierry Paquet • 2020

Related benchmarks

| Task | Dataset | Result | Rank |
| --- | --- | --- | --- |
| Handwritten text recognition | IAM (test) | CER 5 | 102 |
| Handwritten text recognition | IAM-A (test) | CER (%) 4.45 | 24 |
| Handwritten text recognition | READ 2016 (test) | CER 3.59 | 23 |
| Handwritten text recognition | IAM Aachen (test) | CER 4.45 | 23 |
| Handwritten text recognition | RIMES (test) | CER 1.91 | 15 |
| Handwriting Recognition | IAM page paragraph | CER 4.5 | 6 |
| Handwritten text recognition | IAM-B (test) | CER 4.32 | 6 |
| Handwritten text recognition | READ 2016 | CER 4.1 | 6 |
| Handwritten text recognition | RIMES line level (test) | CER 3.04 | 5 |
| Handwritten Document Recognition | READ Line level 2016 (test) | CER 4.1 | 4 |
Showing 10 of 12 rows
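
The CER values above are character error rates. As a rough illustration, the sketch below uses the common definition: total Levenshtein edit distance between predictions and references, divided by the total number of reference characters, expressed as a percentage. Whether each benchmark listed here normalizes and aggregates in exactly this way is an assumption, and the function names are illustrative.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of r
                curr[j - 1] + 1,     # insertion of h
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]

def cer(references, hypotheses):
    """Corpus-level character error rate in percent.

    Sums edits and reference lengths over all pairs before dividing
    (an assumed aggregation; per-sample averaging gives other numbers).
    """
    total_edits = sum(edit_distance(r, h) for r, h in zip(references, hypotheses))
    total_chars = sum(len(r) for r in references)
    return 100.0 * total_edits / total_chars

# One substitution in a five-character reference.
score = cer(["hello"], ["hallo"])
```

Under this definition, one wrong character in a five-character reference gives a CER of 20%, so a paragraph-level CER of 1.91% corresponds to roughly two character errors per hundred reference characters.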
