Visual Question Answering on TextVQA (F1, Hallucination Rate, Budget Usage)
[Chart: F1 Score by method. Top result: 72.9 (PoP), Feb 27, 2026. Available metrics: F1 Score, Hallucination Rate, Average Budget Usage (B-bar).]
Updated 1mo ago
Evaluation Results
| Method | Reasoning protocol | Date | F1 Score | Hallucination Rate | Average Budget Usage (B-bar) |
|---|---|---|---|---|---|
| PoP | PoP | 2026.02 | 72.9 | 9.3 | 12.4 |
| ProgVLM | Pro... | 2026.02 | 69.7 | 15.1 | 14.7 |
| MM-ReAct | MM-... | 2026.02 | 69.2 | 15.6 | 15.6 |
| M-CoT | M-CoT | 2026.02 | 66.9 | 19.8 | 0.25 |
| Direct | Direct | 2026.02 | 64.7 | 22.3 | 0.25 |