
Uni-MuMER: Unified Multi-Task Fine-Tuning of Vision-Language Model for Handwritten Mathematical Expression Recognition

About

Handwritten Mathematical Expression Recognition (HMER) remains a persistent challenge in Optical Character Recognition (OCR) due to the inherent freedom of symbol layouts and the variability of handwriting styles. Prior methods have hit performance bottlenecks and have proposed isolated architectural modifications that are difficult to integrate coherently into a unified framework. Meanwhile, recent advances in pretrained vision-language models (VLMs) have demonstrated strong cross-task generalization, offering a promising foundation for developing unified solutions. In this paper, we introduce Uni-MuMER, which fully fine-tunes a VLM for the HMER task without modifying its architecture, effectively injecting domain-specific knowledge into a generalist framework. Our method integrates three data-driven tasks: Tree-Aware Chain-of-Thought (Tree-CoT) for structured spatial reasoning, Error-Driven Learning (EDL) for reducing confusion among visually similar characters, and Symbol Counting (SC) for improving recognition consistency in long expressions. Experiments on the CROHME and HME100K datasets show that Uni-MuMER achieves state-of-the-art performance, outperforming the best lightweight specialized model, SSAN, by 16.31% and the top-performing VLM, Gemini2.5-flash, by 24.42% in the zero-shot setting. Our datasets, models, and code are open-sourced at: https://github.com/BFlameSwift/Uni-MuMER

Yu Li, Jin Jiang, Jianhua Zhu, Shuai Peng, Baole Wei, Yuxuan Zhou, Liangcai Gao • 2025
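
To make the Symbol Counting (SC) task named in the abstract more concrete, here is a minimal sketch of how a per-symbol count could be derived from a LaTeX label. The abstract does not specify the actual target format, so the tokenizer, the set of skipped structural tokens, and the helper names `count_symbols` and `counting_target` are all assumptions made for illustration, not the authors' implementation.

```python
import re
from collections import Counter

# Hypothetical tokenizer (an assumption, not from the paper): a token is a
# LaTeX command such as \frac, an escaped character such as \{, or any
# single non-whitespace character.
TOKEN_RE = re.compile(r"\\[A-Za-z]+|\\.|\S")

# Assumption: grouping and script markers are layout syntax, not written
# symbols, so they are left out of the count.
STRUCTURAL = {"{", "}", "^", "_"}


def count_symbols(latex: str) -> Counter:
    """Count how often each non-structural token occurs in a LaTeX string."""
    tokens = TOKEN_RE.findall(latex)
    return Counter(t for t in tokens if t not in STRUCTURAL)


def counting_target(latex: str) -> str:
    """Render the counts as one text line, e.g. a possible auxiliary target."""
    counts = count_symbols(latex)
    return ", ".join(f"{sym}: {n}" for sym, n in sorted(counts.items()))


if __name__ == "__main__":
    print(counting_target(r"\frac{x^2+1}{x}"))
```

A count like this gives a model an explicit check on how many symbols a long expression should contain, which is the kind of consistency signal the abstract attributes to SC.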

Related benchmarks

Task                                             Dataset                        Metric                 Result  Rank
Handwritten Mathematical Expression Recognition  CROHME 2016 (test)             Expression Rate (Exp)  77.94   164
Handwritten Mathematical Expression Recognition  CROHME 2014 (test)             Expression Rate (Exp)  82.05   156
Multimodal Reasoning                             MMMU (val)                     Accuracy               48.67   114
Handwritten Mathematical Expression Recognition  CROHME 2019 (test)             Expression Rate (Exp)  79.23   107
Mathematical Reasoning                           MathVista mini (test)          Accuracy               51.1    67
Mathematical Reasoning                           MATH-Vision                    Accuracy               24.34   32
Handwritten Mathematical Expression Recognition  CROHME 2023 (test)             Expression Rate        70.26   11
Handwritten Mathematical Expression Recognition  MathWriting (test)             CER                    4       9
Mathematical Expression Recognition              im2latex normalized v2 (test)  Edit Distance          99.12   7
Expression Recognition                           MNE N1 (test)                  Expression Rate        76      6

Showing 10 of 17 rows
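
Most rows above report Expression Rate, the share of test expressions whose predicted LaTeX matches the ground truth exactly, given as a percentage. The sketch below shows one way to compute it. The whitespace normalization and the function names are assumptions for illustration, since this page does not state the exact matching rules used by each benchmark.

```python
from typing import Sequence


def normalize(latex: str) -> str:
    """Collapse runs of whitespace so spacing differences do not count as
    errors. Assumption: benchmarks may apply stricter or looser rules."""
    return " ".join(latex.split())


def expression_rate(predictions: Sequence[str], references: Sequence[str]) -> float:
    """Percentage of predictions that exactly match their reference."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    matches = sum(
        normalize(p) == normalize(r) for p, r in zip(predictions, references)
    )
    return 100.0 * matches / len(references)


if __name__ == "__main__":
    preds = ["x ^ 2", "a + b"]
    refs = ["x ^ 2", "a - b"]
    print(expression_rate(preds, refs))
```

Because the metric is all-or-nothing per expression, a single wrong symbol in a long formula counts the whole expression as a miss, which is why long expressions are especially hard under this measure.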
