
TextVQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Visual Question Answering | TextVQA | Accuracy 85.4 | 1,453 |
| Text-based Visual Question Answering | TextVQA | Accuracy 88.5 | 962 |
| Visual Question Answering | TextVQA (val) | VQA Score 7,040 | 365 |
| Text-based Visual Question Answering | TextVQA (val) | Accuracy 86.5 | 276 |
| Visual Question Answering | TextVQA | TextVQA Accuracy 85.9 | 210 |
| Visual Question Answering | TextVQA (test) | Accuracy 81.1 | 124 |
| Text-based Visual Question Answering | TextVQA | Score 67.32 | 112 |
| Text-based Visual Question Answering | TextVQA (VQA^T) | Accuracy 78 | 96 |
| Visual Question Answering | TextVQA | Accuracy 88.7 | 94 |
| Visual Question Answering | TextVQA v1.0 (val) | Accuracy 85.5 | 84 |
| Visual Question Answering | TextVQA | Accuracy 97.15 | 79 |
| OCR-related Understanding Tasks | TextVQA (val) | Accuracy 86.62 | 64 |
| Text-based Visual Question Answering | TextVQA VQAT | Accuracy 69.74 | 61 |
| Text-based Visual Question Answering | TextVQA | Score 85.2 | 60 |
| Text-based Visual Question Answering | TextVQA | Accuracy 61.3 | 58 |
| OCR Visual Question Answering | TextVQA | Accuracy 83.69 | 57 |
| Image Understanding | TextVQA | Accuracy 725 | 43 |
| Visual Question Answering on Text | TextVQA | Accuracy 58.21 | 41 |
| Visual Question Answering | TextVQA v1.0 (test) | Accuracy 86.79 | 40 |
| Visual Question Answering | TextVQA | Accuracy 97 | 38 |
| Visual Question Answering | TextVQA | Clean Accuracy 70.3 | 37 |
| Text-based Visual Question Answering | TextVQA | TextVQA Accuracy 73.78 | 33 |
| Text-based Visual Question Answering | TextVQA | ANLS 60.7 | 33 |
| Visual Question Answering | TextVQA | VQA Accuracy 39 | 33 |
| Multimodal Prompt Injection Attack | TextVQA | Attack Success Rate (ASR) 88.24 | 30 |
Showing 25 of 82 rows