GrailQA

Benchmarks

| Task Name | Dataset Name | SOTA Metric | SOTA Result | Trend |
| --- | --- | --- | --- | --- |
| Multi-Hop Knowledge Graph Question Answering | GrailQA | Hits@1 | 94.4 | 68 |
| Knowledge Base Question Answering | GrailQA Freebase (test) | Hits@1 | 86.9 | 48 |
| Knowledge Base Question Answering | GrailQA | Hits@1 | 89.3 | 40 |
| Knowledge Base Question Answering | GrailQA (dev) | Overall F1 | 89.5 | 34 |
| Knowledge Base Question Answering | GrailQA v1.0 (test) | Overall EM | 77.45 | 33 |
| Knowledge Base Question Answering | GrailQA (test) | F1 | 91.76 | 27 |
| Knowledge Graph Question Answering | GrailQA (Overall) | Hits@1 | 86.4 | 20 |
| Knowledge Base Question Answering | GrailQA 500-sample (dev) | F1 Score | 84.7 | 18 |
| Knowledge Graph Question Answering | GrailQA Zero-shot | Hits@1 | 89.1 | 17 |
| Knowledge Graph Question Answering | GrailQA Compositional | Hits@1 | 80 | 17 |
| Knowledge Graph Question Answering | GrailQA I.I.D. | Hits@1 | 92 | 17 |
| Knowledge Base Question Answering | GrailQA Compositional (dev) | F1 Score | 87 | 14 |
| Knowledge Base Question Answering | GrailQA i.i.d. (dev) | F1 | 93.8 | 14 |
| Knowledge Graph Question Answering | GrailQA (test) | Overall Score | 84.7 | 14 |
| Knowledge Base Question Answering | GrailQA | Hits@1 | 80.4 | 11 |
| Generalization Knowledge Base Question Answering | GrailQA | Hit@1 | 89.3 | 9 |
| Knowledge Base Question Answering | GrailQA v1.0 (dev) | F1 | 83.4 | 9 |
| Knowledge Base Question Answering | GrailQAbility answerable zero-shot | F1 (L) | 78.01 | 8 |
| Knowledge Base Question Answering | GrailQAbility answerable (IID) | F1(L) | 89 | 8 |
| Knowledge Graph Question Answering | GrailQA IID | F1 Score | 92.4 | 6 |
| Knowledge Base Question Answering | GrailQAbility unanswerable questions Zero-Shot (test) | F1(R) | 88.31 | 4 |
| Knowledge Base Question Answering | GrailQAbility unanswerable questions (test IID) | F1(R) | 97.01 | 4 |
| Knowledge Base Question Answering | GrailQA (val) | Overall F1 | 83.33 | 4 |
| Knowledge Base Question Answering | GrailQA (hard) | EM | 51.5 | 3 |