TextBoxes: A Fast Text Detector with a Single Deep Neural Network

About

This paper presents an end-to-end trainable fast scene text detector, named TextBoxes, which detects scene text with both high accuracy and efficiency in a single network forward pass, involving no post-processing except for a standard non-maximum suppression. TextBoxes outperforms competing methods in terms of text localization accuracy and is much faster, taking only 0.09 s per image in a fast implementation. Furthermore, combined with a text recognizer, TextBoxes significantly outperforms state-of-the-art approaches on word spotting and end-to-end text recognition tasks.
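
The only post-processing step the abstract mentions is standard non-maximum suppression (NMS). As a rough illustration of what that step does, here is a minimal sketch of greedy NMS over axis-aligned boxes. It is not the authors' implementation: the function names `iou` and `nms`, the `(x_min, y_min, x_max, y_max)` box layout, and the 0.5 overlap threshold are all assumptions made for this example.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first.

    The 0.5 default threshold is an illustrative assumption,
    not a value taken from the paper.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


# Two heavily overlapping word boxes and one separate box.
boxes = [(0, 0, 10, 4), (1, 0, 11, 4), (20, 20, 30, 24)]
scores = [0.9, 0.8, 0.7]
kept = nms(boxes, scores)
```

In this example the second box overlaps the first with an IoU of about 0.82, so it is suppressed and only the first and third boxes are kept.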

Minghui Liao, Baoguang Shi, Xiang Bai, Xinggang Wang, Wenyu Liu • 2016

Related benchmarks

Task	Dataset	Metric	Result
Text Detection	Total-Text	Precision	62.1	160
Text Detection	Total-Text (test)	F-Measure	52.5	126
Scene Text Detection	TotalText (test)	Recall	45.5	106
Scene Text Spotting	Total-Text (test)	F-measure (None)	36.3	105
Text Detection	ICDAR 2013 (test)	F1 Score	86	88
Text Localization	ICDAR 2013 (test)	Recall	83	28
End-to-End Text Spotting	ICDAR 2013 (test)	Score S	93.9	25
End-to-end Recognition	Total-Text	F1 Score	36.3	22
End-to-End Text Recognition	Total-Text (test)	F-measure (None)	36.3	17
Word Spotting	ICDAR 2013	Generic Score	87	12

Showing 10 of 15 rows
