Document Understanding on DUDE
[Chart: Accuracy over time, Feb 16, 2026 to Mar 31, 2026. Best result: Qwen3 VL, 61.8. Updated 12d ago.]
Evaluation Results
| Method | Attributes | Date | Accuracy |
|---|---|---|---|
| Qwen3 VL | Checkpoint=32B | 2026.02 | 61.8 |
| Qwen3 VL 32B Instruct | Model family=Qwen3 VL | 2026.03 | 61.8 |
| Qwen3 VL | Checkpoint=235B A22B | 2026.02 | 59.1 |
| Qwen3 VL 235B A22B Instruct | Model family=Qwen3 VL | 2026.03 | 59.1 |
| LongPO | Checkpoint=Short Stage | 2026.02 | 56 |
| LongPO | Model family=Qwen3 VL | 2026.03 | 56 |
| No-think | Model family=Qwen3 VL | 2026.03 | 55.3 |
| Synthetic Reasoning | Model family=Qwen3 VL | 2026.03 | 55.1 |
| Qwen3 VL Plain Distillation | Checkpoint=Short Stage | 2026.02 | 54.8 |
| Plain Distillation | Model family=Qwen3 VL | 2026.03 | 54.8 |
| Synthetic Reasoning | Model family=Mistral | 2026.03 | 54.1 |
| Mistral Plain Distillation* | | 2026.02 | 54 |
| Plain Distillation | Model family=Mistral | 2026.03 | 54 |
| Qwen Thinking Traces | Model family=Mistral | 2026.03 | 54 |
| Mistral 3.1 Small | Checkpoint=24B | 2026.02 | 52.8 |
| Mistral 3.1 Small 24B | Model family=Mistral | 2026.03 | 52.8 |
| No-think | Model family=Mistral | 2026.03 | 51.4 |