Open-set Text Recognition via Character-Context Decoupling
About
The open-set text recognition task is an emerging challenge that requires the additional capability to recognize novel characters during evaluation. We argue that a major cause of the limited performance of current methods is the confounding effect of contextual information on the visual information of individual characters. In open-set scenarios, the intractable bias in contextual information can be passed down to visual information, consequently impairing classification performance. In this paper, a Character-Context Decoupling framework is proposed to alleviate this problem by separating contextual information from character-visual information. Contextual information can be decomposed into temporal information and linguistic information. Here, temporal information, which models character order and word length, is isolated with a detached temporal attention module. Linguistic information, which models n-gram and other linguistic statistics, is separated with a decoupled context anchor mechanism. A variety of quantitative and qualitative experiments show that our method achieves promising performance on open-set, zero-shot, and closed-set text recognition datasets.
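The confounding effect described above can be illustrated with a toy sketch. This is not the paper's framework or its modules; it is a minimal illustration under stated assumptions. The score values, the prior, and the names `coupled_predict` and `decoupled_predict` are all hypothetical, and `"Z"` stands in for a novel character that never appeared in training text.

```python
import math

def softmax(scores):
    """Turn raw scores into a probability distribution over characters."""
    m = max(scores.values())
    exps = {k: math.exp(v - m) for k, v in scores.items()}
    total = sum(exps.values())
    return {k: v / total for k, v in exps.items()}

# Hypothetical visual similarity scores for one character position.
# "Z" plays the role of a novel character unseen during training.
visual_scores = {"a": 1.0, "b": 2.0, "Z": 2.5}

# Hypothetical context prior estimated from training text only,
# so the novel character receives almost no probability mass.
context_prior = {"a": 0.30, "b": 0.69, "Z": 0.01}

def coupled_predict(visual, prior):
    """Classify with visual evidence multiplied by the biased context prior."""
    vis = softmax(visual)
    joint = {k: vis[k] * prior[k] for k in vis}
    return max(joint, key=joint.get)

def decoupled_predict(visual):
    """Classify from visual evidence alone, ignoring the context prior."""
    vis = softmax(visual)
    return max(vis, key=vis.get)

coupled = coupled_predict(visual_scores, context_prior)
decoupled = decoupled_predict(visual_scores)
print("coupled:", coupled, "| decoupled:", decoupled)
```

In this toy setup the visual scores favor the novel character, but the context prior learned on seen characters overrides them in the coupled prediction. Keeping the two sources of evidence separate avoids that bias for the character-level decision.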
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Scene Text Recognition | IIIT5K | Accuracy | 91.9 | 149 |
| Scene Text Recognition | CUTE | Accuracy | 83.68 | 92 |
| Scene Text Recognition | IC03 | Accuracy | 92.38 | 67 |
| Scene Text Recognition | SVT | Accuracy | 85.93 | 67 |
| Scene Text Recognition | IC13 | Accuracy | 92.21 | 66 |
| Character Recognition | HWDB | Accuracy | 95.55 | 24 |
| Character Recognition | CTW | Accuracy | 77.18 | 20 |
| Scene Text Recognition | IC03 | Accuracy (Full Lexicon) | 96.9 | 15 |
| Text Recognition | IIIT5K | Accuracy (small) | 99.8 | 6 |
| Open-set Text Recognition | MLT Japanese 2019 (test) | Character Accuracy (Overall) | 65.34 | 4 |