PaddleOCR 3.0 Technical Report
About
This technical report introduces PaddleOCR 3.0, an Apache-licensed open-source toolkit for OCR and document parsing. To address the growing demand for document understanding in the era of large language models, PaddleOCR 3.0 presents three major solutions: (1) PP-OCRv5 for multilingual text recognition, (2) PP-StructureV3 for hierarchical document parsing, and (3) PP-ChatOCRv4 for key information extraction. Although each of these models has fewer than 100 million parameters, they achieve accuracy and efficiency competitive with mainstream vision-language models (VLMs) that have billions of parameters. In addition to a high-quality OCR model library, PaddleOCR 3.0 provides efficient tools for training, inference, and deployment, supports heterogeneous hardware acceleration, and lets developers build intelligent document applications with little effort.
Related benchmarks
| Task | Dataset | Metric | Value | Rank |
|---|---|---|---|---|
| Document Parsing | OmniDocBench v1.5 | Overall Score | 86.73 | 195 |
| Document Parsing | OmniDocBench 1.5 (test) | Text Edit Error | 0.021 | 111 |
| Document Parsing | Real5-OmniDocBench (screen-photography) | Overall Score | 66.89 | 32 |
| Document Parsing | OmniDocBench Real5 warping | Overall Score | 59.34 | 32 |
| Document Parsing | OmniDocBench Real5 skewing variation | Overall Score | 37.98 | 32 |
| Document Parsing | OmniDocBench Real5 | Score | 73.38 | 26 |
| Document Reading | LogicsDocBench | Overall Score | 70.26 | 20 |
| Document Understanding | OmniDocBench standard (test) | Overall Score | 86.73 | 19 |
| Document Parsing | Real5-OmniDocBench scanning scenario 1.5 (test) | Overall Score | 84.68 | 19 |
| Document Parsing | Real5-OmniDocBench 5-distortion types (test) | Overall Accuracy | 64.45 | 19 |