
SQA

Benchmarks

Task Name | Dataset Name | SOTA Result | Trend
Visual Question Answering | SQA | Accuracy 93.42 | 41
Science Question Answering | SQA-I | Score 79 | 38
Science Question Answering | SQA IMG | Accuracy 70.7 | 37
Science Question Answering | SQA | Accuracy (SQA) 98 | 33
Sequential Question Answering | SQA (test) | Accuracy (All) 74.5 | 33
Science Question Answering | SQA | SQA Score 96.6 | 26
Visual Question Answering | SQA-Image | Accuracy 70.2 | 25
Question Answering | SQA | Accuracy 79.62 | 24
Reasoning | SQA | Accuracy 85 | 23
Science Question Answering | SQA IMG | Score 97.67 | 23
Science Question Answering | SQA | SQA Score 97 | 22
Image-Language Understanding | SQA | EM 71.6 | 21
Science Question Answering | SQA | SQA Score 69.91 | 19
Deep Research | SQA v2 | Score 88.3 | 18
Science Question Answering | SQA | Exact Match 98.76 | 14
Science Question Answering | SQA | Score 69.3 | 13
Science Question Answering | SQA-I | Accuracy 67.9 | 13
Search-based Question Answering | SQA CS V2 (test) | IR 57.62 | 12
Table Question Answering | SQA (test) | Accuracy (All) 72.4 | 11
Table Question Answering | SQA Perturbed (test) | Overall Accuracy 0.723 | 8
Scholarly Question Answering | SQA CS V2 | Overall Score 89.7 | 6
3D Visual Question Answering | SQA (test) | EM@1 53.32 | 5
Sequential Question Answering | SQA | Overall Accuracy 74.5 | 5
Sequential Question Answering | SQA first fold (dev) | Accuracy (ALL) 68 | 5
Scholarly QA | SQA v2 | Score 41.8 | 4
Showing 25 of 30 rows