Scene Text-Centric Visual Question Answering on STVQA

Best result: 0.759 accuracy (Visual-SR1)

Updated 2mo ago

Evaluation Results

Method	Date	Accuracy
Visual-SR1	2025.09	0.759
ViCrop	2025.09	0.724
DeFacto	2025.09	0.712
Qwen2.5-VL	2025.09	0.679
GRIT	2025.09	0.647
InternVL	2024.07	0.622
InternLM-XComposer2	2024.07	0.596
Monkey	2024.07	0.547
TextHarmony-Chat	2024.07	0.513
mPLUG-Owl2	2024.07	0.498
TextHarmony	2024.07	0.497
TextHarmony*	2024.07	0.472
DeepEyes	2025.09	0.459
DocPedia	2024.07	0.455
LLaVAR	2024.07	0.392
LLaVA1.5-7B	2024.07	0.381
UniDoc	2024.07	0.352
MM-Interleaved	2024.07	0.264
SEED-LLaMA-14B	2024.07	0.201
MiniGPT5	2024.07	0.024