STVQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Scene Text-Centric Visual Question Answering | STVQA | Accuracy: 0.759 | 20 |
| Spatial Reasoning | STVQA 300 samples 7k (train) | Relative Score: 88.5 | 13 |
| Spatial Reasoning | STVQA-7k (test) | Relative Position Accuracy: 79.3 | 6 |
| Visual Question Answering | STVQA-7k | Relation Acc: 86.4 | 6 |