PDF Parsing on Internal PDF Parsing Dataset
[Chart: Text Extraction Accuracy. Proposed: 99 (Apr 25, 2026)]
Text Extraction Accuracy
Caption Similarity
Table Bounding Box Accuracy (Tbl BBA)
Image Bounding Box Accuracy (Img BBA)
Form Bounding Box Accuracy (Frm BBA)
Overall Detection Completeness (DC)
Table Detection Completeness (Tbl DC)
Image Detection Completeness (Img DC)
Form Detection Completeness (Frm DC)
Updated 1mo ago
Evaluation Results
| Method | Date | Text Extraction Accuracy | Caption Similarity | Table Bounding Box Accuracy (Tbl BBA) | Image Bounding Box Accuracy (Img BBA) | Form Bounding Box Accuracy (Frm BBA) | Overall Detection Completeness (DC) | Table Detection Completeness (Tbl DC) | Image Detection Completeness (Img DC) | Form Detection Completeness (Frm DC) |
|---|---|---|---|---|---|---|---|---|---|---|
| Proposed | 2026.04 | 99 | 93 | 96 | 98 | 100 | 100 | 100 | 100 | 100 |
| PyMuPDF | 2026.04 | 98 | 75 | 76 | 95 | - | 82 | 79 | 91 | - |
| Tika | 2026.04 | 98 | - | - | 82 | - | 69 | - | 69 | - |
| PDFPlumber | 2026.04 | 97 | - | 81 | - | - | 88 | 88 | - | - |
| unstructured | 2026.04 | 97 | - | 82 | 86 | - | 73 | 85 | 72 | - |
| PDFMiner | 2026.04 | 97 | - | - | 96 | - | 91 | - | 91 | - |
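Several methods in the results above have no score ("-") for some metrics, so a side-by-side comparison is only fair on metrics that every method reports. The following is a minimal Python sketch of that filtering, with the scores copied from the results. The short metric labels, the helper names `shared_metrics` and `mean_over`, and the unweighted mean are illustrative assumptions, not part of the benchmark's own scoring.

```python
from statistics import mean

# Short labels for the nine metrics, in the order they appear in the results.
METRICS = ["Text", "Caption", "Tbl BBA", "Img BBA", "Frm BBA",
           "DC", "Tbl DC", "Img DC", "Frm DC"]

# Scores copied from the results; None marks a "-" (no score reported).
RESULTS = {
    "Proposed":     [99, 93, 96, 98, 100, 100, 100, 100, 100],
    "PyMuPDF":      [98, 75, 76, 95, None, 82, 79, 91, None],
    "Tika":         [98, None, None, 82, None, 69, None, 69, None],
    "PDFPlumber":   [97, None, 81, None, None, 88, 88, None, None],
    "unstructured": [97, None, 82, 86, None, 73, 85, 72, None],
    "PDFMiner":     [97, None, None, 96, None, 91, None, 91, None],
}


def shared_metrics(results):
    """Return the indices of metrics that every method reports."""
    return [
        i for i in range(len(METRICS))
        if all(row[i] is not None for row in results.values())
    ]


def mean_over(results, indices):
    """Unweighted mean per method over the given metric indices (an assumption)."""
    return {name: mean(row[i] for i in indices) for name, row in results.items()}


shared = shared_metrics(RESULTS)
averages = mean_over(RESULTS, shared)

print("Metrics reported by all methods:", [METRICS[i] for i in shared])
for name, score in sorted(averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: {score}")
```

Only Text Extraction Accuracy and Overall Detection Completeness are reported for all six methods, so any ranking built this way ignores the bounding-box and caption metrics entirely.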