
AdversarialQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Red-teaming | AdversarialQA | ASR 0 | 20 |
| Question Answering | AdversarialQA (val) | EM 38.5 | 19 |
| Question Answering | AdversarialQA | F1 Score 56.1 | 17 |
| Question Answering | AdversarialQA dBERT | Accuracy 39.51 | 14 |
| Question Answering | AdversarialQA dRoberta | Accuracy 28.05 | 10 |
| Safety Evaluation | ADVERSARIALQA | Chinese Accuracy 43.75 | 8 |
| Domain Shift Extractive Question Answering | SQuAD -> AdversarialQA (test) | ECE 0.075 | 6 |
| Question Answering | AdversarialQA dBiDAF | Accuracy 55.12 | 6 |