PosFormer: Recognizing Complex Handwritten Mathematical Expression with Position Forest Transformer

About

Handwritten Mathematical Expression Recognition (HMER) has wide applications in human-machine interaction scenarios, such as digitized education and automated offices. Recently, sequence-based models with encoder-decoder architectures have been commonly adopted to address this task by directly predicting LaTeX sequences from expression images. However, these methods only implicitly learn the syntax rules provided by LaTeX, and may therefore fail to describe the position and hierarchical relationships between symbols when structural relations are complex and handwriting styles are diverse. To overcome this challenge, we propose a position forest transformer (PosFormer) for HMER, which jointly optimizes two tasks, expression recognition and position recognition, to explicitly enable position-aware symbol feature representation learning. Specifically, we first design a position forest that models the mathematical expression as a forest structure and parses the relative position relationships between symbols. Without requiring extra annotations, each symbol is assigned a position identifier in the forest to denote its relative spatial position. Second, we propose an implicit attention correction module to accurately capture attention for HMER in the sequence-based decoder architecture. Extensive experiments validate the superiority of PosFormer, which consistently outperforms state-of-the-art methods, achieving gains of 2.03%/1.22%/2.00%, 1.83%, and 4.62% on the single-line CROHME 2014/2016/2019, multi-line M2E, and complex MNE datasets, respectively, at no additional latency or computational cost. Code is available at https://github.com/SJTU-DeepVisionLab/PosFormer.
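To make the idea of a per-symbol position identifier concrete, here is a minimal sketch. It is not the authors' implementation: it assumes a simplified encoding in which each symbol's identifier is its path of relative positions from the baseline (`M` for the main line, `U` for a superscript, `L` for a subscript), and it only handles `^`, `_`, and brace groups. The function name `position_identifiers` and the letters are hypothetical; PosFormer's actual forest encoding is defined in the paper and repository.

```python
def position_identifiers(tokens):
    """Assign each symbol a path-like position identifier.

    Hypothetical simplified scheme (an assumption, not PosFormer's encoding):
    'M' = baseline, 'U' = superscript, 'L' = subscript.
    Only '^', '_', '{' and '}' are treated as structural tokens.
    """
    result = []
    stack = ["M"]    # position path of each currently open brace group
    pending = None   # 'U' or 'L' if the previous token was '^' or '_'
    for tok in tokens:
        if tok == "^":
            pending = "U"
        elif tok == "_":
            pending = "L"
        elif tok == "{":
            # A group after '^' or '_' is nested one level deeper;
            # a plain group keeps the current path.
            stack.append(stack[-1] + (pending or ""))
            pending = None
        elif tok == "}":
            stack.pop()
        else:
            # An ordinary symbol: record it with its position path.
            result.append((tok, stack[-1] + (pending or "")))
            pending = None
    return result


if __name__ == "__main__":
    # x^{a_b} + y_1
    for symbol, path in position_identifiers(
        ["x", "^", "{", "a", "_", "b", "}", "+", "y", "_", "1"]
    ):
        print(symbol, path)
```

In this toy scheme the identifiers come directly from the LaTeX token sequence, which mirrors the abstract's point that no extra annotations are needed.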

Tongkun Guan, Chengyu Lin, Wei Shen, Xiaokang Yang • 2024

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
| --- | --- | --- | --- | --- |
| Handwritten Mathematical Expression Recognition | CROHME 2016 (test) | Expression Rate (Exp) | 61.03 | 164 |
| Handwritten Mathematical Expression Recognition | CROHME 2014 (test) | Expression Rate (Exp) | 62.74 | 156 |
| Handwritten Mathematical Expression Recognition | CROHME 2019 (test) | Expression Rate (Exp) | 64.97 | 107 |
| Handwritten Mathematical Expression Recognition | HME100K | ExpRate | 69.51 | 17 |
| Handwritten Mathematical Expression Recognition | M2E multi-line (test) | ExpRate | 58.33 | 8 |
| Handwritten Mathematical Expression Recognition | MNE N3 (test) | Expression Rate | 36.82 | 6 |
| Expression Recognition | MNE N3 (test) | Expression Error Rate | 36.82 | 6 |
| Expression Recognition | MNE N1 (test) | Expression Rate | 60.59 | 6 |
| Expression Recognition | MNE N2 (test) | Recognition Rate | 0.3882 | 6 |
| Handwritten Mathematical Expression Recognition | MNE N1 (test) | ExpRate | 60.59 | 6 |
Showing 10 of 12 rows
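
The expression rate (ExpRate) figures above are conventionally the percentage of test expressions whose predicted LaTeX sequence matches the ground truth exactly. A minimal sketch of that computation follows, assuming whitespace-tokenized LaTeX strings. The helper name `exp_rate` is hypothetical, and the official CROHME tooling or PosFormer's own evaluation scripts may normalize sequences differently.

```python
def exp_rate(predictions, references):
    """Percentage of predictions that exactly match their reference.

    Both arguments are lists of whitespace-tokenized LaTeX strings
    (an assumption made for this sketch).
    """
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have equal length")
    if not references:
        return 0.0
    # Compare token lists so that differences in spacing do not count as errors.
    correct = sum(p.split() == r.split() for p, r in zip(predictions, references))
    return 100.0 * correct / len(references)


if __name__ == "__main__":
    preds = ["x ^ 2", "a + b"]
    refs = ["x ^ 2", "a - b"]
    print(exp_rate(preds, refs))
```

Because the match is all-or-nothing per expression, a single wrong symbol makes the whole expression count as incorrect.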
