
NewsQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | NewsQA (dev) | F1 Score 75.5 | 101 |
| Question Answering | NewsQA | F1 80.5 | 49 |
| Uncertainty Estimation | NewsQA | AUROC 76.6 | 42 |
| Membership Inference Attack | newsQA | AUC 92.9 | 39 |
| Question Answering | NewsQA (test) | F1 73.6 | 31 |
| Extractive Question Answering | NewsQA MRQA | F1 72.6 | 22 |
| Question Answering | NewsQA trained on SQuAD OOD (test) | F1 Score 52.41 | 20 |
| Question Answering | NewsQA Short-doc | R@1 66.43 | 16 |
| Extractive Question Answering | NewsQA | F1 Score 59.7 | 14 |
| Question Answering | NewsQA trained on CausalQA OOD (test) | F1 Score 9.54 | 10 |
| Sentence Selection | NewsQA (dev) | Accuracy 94.6 | 7 |
| Question Answering | NewsQA | Score 66.8 | 6 |
| Question Answering | NewsQA | Latency (s) 112.36 | 3 |
| Document Retrieval | NewsQA | Recall@1 93.3 | 2 |