
DeepSeek-OCR 2: Visual Causal Flow

About

We present DeepSeek-OCR 2 to investigate the feasibility of a novel encoder, DeepEncoder V2, capable of dynamically reordering visual tokens according to image semantics. Conventional vision-language models (VLMs) invariably process visual tokens in a rigid raster-scan order (top-left to bottom-right) with fixed positional encoding when fed into LLMs. However, this contradicts human visual perception, which follows flexible yet semantically coherent scanning patterns driven by inherent logical structures. Particularly for images with complex layouts, human vision exhibits causally informed sequential processing. Inspired by this cognitive mechanism, DeepEncoder V2 is designed to endow the encoder with causal reasoning capabilities, enabling it to reorder visual tokens before the LLM interprets the content. This work explores a novel paradigm: whether 2D image understanding can be effectively achieved through two cascaded 1D causal reasoning structures, thereby offering a new architectural approach with the potential to achieve genuine 2D reasoning. Code and model weights are publicly accessible at http://github.com/deepseek-ai/DeepSeek-OCR-2.
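
To make the contrast in the abstract concrete, here is a toy sketch of fixed raster-scan ordering versus a reordered token sequence. It is only an illustration under stated assumptions: the function names `raster_scan` and `reorder` are hypothetical, and the hand-written `ranks` stand in for whatever ordering DeepEncoder V2 actually learns, which the abstract does not specify.

```python
from typing import List, Sequence, Tuple

Token = Tuple[int, int]  # (row, col) position of a patch in the image grid


def raster_scan(rows: int, cols: int) -> List[Token]:
    """Return patch positions in fixed top-left to bottom-right order."""
    return [(r, c) for r in range(rows) for c in range(cols)]


def reorder(tokens: Sequence[Token], reading_rank: Sequence[int]) -> List[Token]:
    """Reorder tokens by a per-token rank.

    The rank is a hypothetical stand-in for a learned, semantics-driven
    order; it is NOT how DeepEncoder V2 computes its ordering.
    """
    if len(tokens) != len(reading_rank):
        raise ValueError("tokens and reading_rank must have the same length")
    order = sorted(range(len(tokens)), key=lambda i: reading_rank[i])
    return [tokens[i] for i in order]


# A 2x4 patch grid for an assumed two-column page layout:
# grid columns 0-1 hold the left text column, 2-3 the right one.
tokens = raster_scan(2, 4)

# Hand-written ranks: read the left column top to bottom, then the right.
ranks = [0, 1, 4, 5, 2, 3, 6, 7]
reordered = reorder(tokens, ranks)

print("raster:   ", tokens)
print("reordered:", reordered)
```

In raster order, the sequence jumps from the left text column to the right one on every row. The reordered sequence finishes the left column before starting the right, which is the kind of layout-aware reading order the abstract motivates.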

Haoran Wei, Yaofeng Sun, Yukun Li • 2026

Related benchmarks

Task                             Dataset                   Metric                   Result   Rank
Optical Character Recognition    OCRBench                  Score                    459      232
Document Parsing                 OmniDocBench v1.5         Overall Score            91.09    195
Document Parsing                 OmniDocBench 1.5 (test)   Text Edit Error          0.048    111
Document Recognition             OmniDocBench              Overall Score            91.09    29
Document reading                 LogicsDocBench            Overall Score            77.04    20
Optical Character Recognition    OCRBench v2               Accuracy (en)            15.98    14
Molecular structure recognition  UOB Synthetic             Exact Matching Accuracy  0.2      11
Document Parsing                 OlmOCR Bench v1.0 (test)  Overall Score            77.2     10
Document Understanding           FireRedBench (test)       Overall Score            0.6161   10
Molecular structure recognition  CLEF Synthetic            Exact Match Accuracy     0.00e+0  10
Showing 10 of 25 rows
