
PaddleOCR-VL: Boosting Multilingual Document Parsing via a 0.9B Ultra-Compact Vision-Language Model

About

In this report, we propose PaddleOCR-VL, a SOTA and resource-efficient model tailored for document parsing. Its core component is PaddleOCR-VL-0.9B, a compact yet powerful vision-language model (VLM) that integrates a NaViT-style dynamic resolution visual encoder with the ERNIE-4.5-0.3B language model to enable accurate element recognition. This innovative model efficiently supports 109 languages and excels in recognizing complex elements (e.g., text, tables, formulas, and charts), while maintaining minimal resource consumption. Through comprehensive evaluations on widely used public benchmarks and in-house benchmarks, PaddleOCR-VL achieves SOTA performance in both page-level document parsing and element-level recognition. It significantly outperforms existing solutions, exhibits strong competitiveness against top-tier VLMs, and delivers fast inference speeds. These strengths make it highly suitable for practical deployment in real-world scenarios. Code is available at https://github.com/PaddlePaddle/PaddleOCR .

Cheng Cui, Ting Sun, Suyin Liang, Tingquan Gao, Zelun Zhang, Jiaxuan Liu, Xueqing Wang, Changda Zhou, Hongen Liu, Manhui Lin, Yue Zhang, Yubo Zhang, Handong Zheng, Jing Zhang, Jun Zhang, Yi Liu, Dianhai Yu, Yanjun Ma • 2025

Related benchmarks

| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Document Parsing | OmniDocBench v1.5 | Overall Score | 92.86 | 126 |
| Document Parsing | olmOCR-bench | ArXiv Processing Accuracy | 85.7 | 36 |
| Reading Order Detection | OmniDocBench ZH v1.0 | Edit Distance | 0.063 | 28 |
| Reading Order Detection | OmniDocBench EN v1.0 | Edit Distance | 0.045 | 28 |
| Document Parsing | OmniDocBench 1.5 (test) | Overall Score | 92.86 | 27 |
| Reading Order Detection | OmniDocBench v1.5 | Edit Distance | 0.043 | 21 |
| Document Parsing | Real5-OmniDocBench scanning scenario 1.5 (test) | Overall Score | 92.11 | 19 |
| Document Parsing | OmniDocBench Real5 illumination | Overall Score | 0.8961 | 19 |
| Document Parsing | OmniDocBench Real5 warping | Overall Score | 85.97 | 19 |
| Document Parsing | Real5-OmniDocBench 5-distortion types (test) | Overall Accuracy | 85.54 | 19 |
Showing 10 of 37 rows
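
The reading-order rows above report an Edit Distance, where lower is better. As a rough illustration of how such a score can be computed, here is a minimal sketch of a normalized Levenshtein distance between a predicted and a reference ordering of page elements. The function names and the choice to normalize by the longer sequence length are assumptions made for this example; the source does not state the exact formula the benchmarks use.

```python
from typing import Hashable, Sequence


def levenshtein(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Minimum number of insertions, deletions, and substitutions turning a into b."""
    # prev[j] holds the distance between the first i-1 items of a and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from the first i items of a to an empty sequence
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x
                curr[j - 1] + 1,     # insert y
                prev[j - 1] + cost,  # substitute x with y (free if equal)
            ))
        prev = curr
    return prev[-1]


def normalized_edit_distance(pred: Sequence[Hashable], ref: Sequence[Hashable]) -> float:
    """Edit distance scaled to [0, 1] by the longer length (assumed normalization)."""
    longest = max(len(pred), len(ref))
    if longest == 0:
        return 0.0  # two empty sequences are identical
    return levenshtein(pred, ref) / longest


if __name__ == "__main__":
    # Hypothetical element IDs: the prediction swaps the second and third blocks.
    reference_order = [1, 2, 3, 4]
    predicted_order = [1, 3, 2, 4]
    print(normalized_edit_distance(predicted_order, reference_order))
```

Under this assumed normalization, a score of 0 means the predicted order matches the reference exactly, and values near 1 mean almost every position differs.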
