Text-oriented Visual Question Answering on WTQ
Top result: Visual-SR1, 71.3 Accuracy
[Chart: Accuracy over time, Apr 10, 2024 to Sep 25, 2025]
Updated 13d ago
Evaluation Results

| Method | Configuration | Date | Accuracy |
| --- | --- | --- | --- |
| Visual-SR1 | Backbone=Qwen2.5-VL-7B | 2025.09 | 71.3 |
| DeFacto | Backbone=Qwen2.5-VL-7B | 2025.09 | 63.7 |
| Qwen2.5-VL | Backbone=Qwen2.5-VL-7B | 2025.09 | 62.3 |
| ViCrop | Backbone=LLaVA-1.5 (Vi... | 2025.09 | 51.5 |
| DeepEyes | Backbone=Qwen2.5-VL-7B | 2025.09 | 51.3 |
| GRIT | Backbone=Qwen2.5-VL-3B | 2025.09 | 32 |
| HRVDA | Resolution=1536 | 2024.04 | 31.2 |
| UReader | Resolution=224 | 2024.04 | 29.4 |
| mPLUG-Doc | Resolution=224 | 2024.04 | 26.9 |
| SeRum | Resolution=1280, Fine-... | 2024.04 | 25.5 |
| Donut | Resolution=1280, Fine-... | 2024.04 | 18.8 |
| Qwen-VL | Resolution=448, Open-s... | 2024.04 | 16.1 |