TQA

Benchmarks

| Task Name | Dataset Name | Metric | SOTA Result | Trend |
|---|---|---|---|---|
| Question Answering | TQA (test) | AUROC | 90.2 | 90 |
| Question Answering | TQA | Absolute Execution Time Overhead (s) | 0.173 | 90 |
| Question Answering | TQA | PRR | 86.1 | 90 |
| Question Answering | TQA | Accuracy | 92.3 | 80 |
| Question Answering | TQA | Accuracy | 76.8 | 60 |
| Table Question Answering | TQA FinQA, HiTab, TAT-QA, TabMWP, WTQ | FinQA Accuracy | 40.48 | 20 |
| Question Answering | TQA Poison Attack (test) | Accuracy | 75.6 | 18 |
| Question Answering | TQA PIA Attack (test) | Accuracy | 76.4 | 18 |
| Knowledge gap detection | TQA | Accuracy | 83.2 | 18 |
| Question Answering | TQA poison @ Position 10, k=10 (test) | Robustness Accuracy | 71 | 15 |
| Question Answering | TQA poison @ Position 1, k=10 (test) | Robustness Accuracy | 66.4 | 15 |
| Question Answering | TQA | EM | 42.12 | 14 |
| Visual Question Answering | TQA | Accuracy | 77.5 | 13 |
| Inference Efficiency | TQA | Relative Execution Time Overhead | 0.05 | 12 |
| Open-Domain Question Answering | TQA (test) | EM | 66.45 | 11 |
| Visual Reasoning | TQA | Accuracy | 86.7 | 8 |
| Open-Domain Question Answering | TQA | Accuracy | 71.4 | 8 |
| Information Retrieval | TQA (test) | Recall@5 | 78.3 | 8 |
| Retrieval-Augmented Generation | TQA open | Accuracy | 46.24 | 8 |
| Textbook Question Answering | TQA (test) | Accuracy | 86.7 | 7 |
| Retrieval | TQA | NDCG@10 | 59.02 | 6 |
| Question Answering | TQA Benign (test) | Accuracy | 76.4 | 6 |
| Context Compression & QA | TQA (val) | EM | 59.7 | 6 |
| Open-Domain Question Answering | TQA | R@1 | 37.39 | 4 |
| Open-Domain Question Answering | TQA | Exact Match (EM) | 75.23 | 3 |