PaddleOCR 3.0 Technical Report

About

This technical report introduces PaddleOCR 3.0, an Apache-licensed open-source toolkit for OCR and document parsing. To address the growing demand for document understanding in the era of large language models, PaddleOCR 3.0 presents three major solutions: (1) PP-OCRv5 for multilingual text recognition, (2) PP-StructureV3 for hierarchical document parsing, and (3) PP-ChatOCRv4 for key information extraction. These models, each with fewer than 100 million parameters, achieve accuracy and efficiency competitive with mainstream billion-parameter vision-language models (VLMs). In addition to offering a high-quality OCR model library, PaddleOCR 3.0 provides efficient tools for training, inference, and deployment, supports heterogeneous hardware acceleration, and enables developers to easily build intelligent document applications.

Cheng Cui, Ting Sun, Manhui Lin, Tingquan Gao, Yubo Zhang, Jiaxuan Liu, Xueqing Wang, Zelun Zhang, Changda Zhou, Hongen Liu, Yue Zhang, Wenyu Lv, Kui Huang, Yichao Zhang, Jing Zhang, Jun Zhang, Yi Liu, Dianhai Yu, Yanjun Ma • 2025

Related benchmarks

Task	Dataset	Metric	Result	(unlabeled)
Document Parsing	OmniDocBench v1.5	Overall Score	86.73	195
Document Parsing	OmniDocBench 1.5 (test)	Text Edit Error	0.021	132
Document Parsing	Real5-OmniDocBench (screen-photography)	Overall Score	66.89	32
Document Parsing	OmniDocBench Real5 warping	Overall Score	59.34	32
Document Parsing	OmniDocBench Real5 skewing variation	Overall Score	37.98	32
Document Parsing	OmniDocBench Real5	Score	73.38	26
Document reading	LogicsDocBench	Overall Score	70.26	20
Document Understanding	OmniDocBench standard (test)	Overall Score	86.73	19
Document Parsing	Real5-OmniDocBench scanning scenario 1.5 (test)	Overall Score	84.68	19
Document Parsing	Real5-OmniDocBench 5-distortion types (test)	Overall Accuracy	64.45	19

Showing 10 of 30 rows
