Docling Technical Report
About
This technical report introduces Docling, an easy-to-use, self-contained, MIT-licensed open-source package for PDF document conversion. It is powered by state-of-the-art specialized AI models for layout analysis (DocLayNet) and table structure recognition (TableFormer), and runs efficiently on commodity hardware within a small resource budget. The code interface makes it easy to extend the package with new features and models.
Christoph Auer, Maksym Lysak, Ahmed Nassar, Michele Dolfi, Nikolaos Livathinos, Panos Vagenas, Cesar Berrospi Ramis, Matteo Omenetti, Fabian Lindlbauer, Kasper Dinkla, Lokesh Mishra, Yusik Kim, Shubham Gupta, Rafael Teixeira de Lima, Valery Weber, Lucas Morin, Ingmar Meijer, Viktor Kuropiatnyk, Peter W. J. Staar • 2024
Related benchmarks
| Task | Dataset | Metric | Result | Rank |
|---|---|---|---|---|
| Document Text Generation | OHR-Bench Generation | Text Score | 46.6 | 14 |
| Textual RAG | OHR-Bench (Overall) | TXT Score | 0.43 | 14 |
| Document Retrieval | OHR-Bench Retrieval | Accuracy (Text) | 73.3 | 14 |
| Visual RAG | BizMMRAG | Score (TXT) | 66.7 | 5 |
| Textual RAG | BizMMRAG Japanese (test) | TXT Score | 70 | 5 |
| Textual RAG | Allganize Japanese (test) | TXT Score | 61.3 | 5 |
| Visual RAG | OHR-Bench (test) | TXT Score | 73.2 | 5 |
| Visual RAG | Allganize | TXT Score | 66 | 5 |