TQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | TQA (test) | AUROC: 90.2 | 90 |
| Question Answering | TQA | Absolute Execution Time Overhead (s): 0.173 | 90 |
| Question Answering | TQA | PRR: 86.1 | 90 |
| Question Answering | TQA | Accuracy: 92.3 | 74 |
| Knowledge gap detection | TQA | Accuracy: 83.2 | 18 |
| Question Answering | TQA poison @ Position 10, k=10 (test) | Robustness Accuracy: 71 | 15 |
| Question Answering | TQA poison @ Position 1, k=10 (test) | Robustness Accuracy: 66.4 | 15 |
| Inference Efficiency | TQA | Relative Execution Time Overhead: 0.05 | 12 |
| Open-Domain Question Answering | TQA (test) | EM: 66.45 | 11 |
| Information Retrieval | TQA (test) | Recall@5: 78.3 | 8 |
| Retrieval-Augmented Generation | TQA open | Accuracy: 46.24 | 8 |
| Context Compression & QA | TQA (val) | EM: 59.7 | 6 |