WikiTQ

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Table Question Answering | WikiTQ (test) | Accuracy 84.3 | 130 |
| Table Question Answering | WikiTQ | Accuracy 91.84 | 118 |
| Table Question Answering | WikiTQ | F1 Score 79.8 | 50 |
| Out-of-Distribution Agent Reasoning | WikiTQ (OOD) | OOD WikiTQ Score 74.68 | 30 |
| Table Question Answering | WikiTQ | Accuracy 77.2 | 29 |
| Table Question Answering | WikiTQ (dev) | Denotation Acc 58.3 | 18 |
| NL-to-SQL | WikiTQ | Execution Accuracy 59.7 | 12 |
| Language-to-Code Generation | WikiTQ official (test) | Execution Accuracy 74.6 | 12 |
| Table Reasoning | WikiTQ | Exact Match (EM) 79.4 | 11 |
| Language-to-Code Generation | WikiTQ official (dev) | Execution Accuracy 70.9 | 11 |
| Table Question Answering | WikiTQ (challenge-set) | F1 Score 74.41 | 10 |
| Table Reasoning | WikiTQ (T) | Accuracy 65.56 | 10 |
| Table Reasoning | WikiTQ (D) | Accuracy 0.651 | 10 |
| Table Question Answering | WikiTQ Large (>4000 tokens) | Accuracy 57 | 8 |
| NL-to-Scala | WikiTQ | Execution Accuracy 47.6 | 6 |
| Table Question Answering | WikiTQ | BLEU 63.19 | 5 |