TyDi QA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | TyDi QA | Accuracy: 69.4 | 43 |
| Context Attribution | TyDi QA random subset of 10,000 samples | Log-Probability Drop: 0.893 | 12 |
| Attribution Quality Evaluation | TyDi QA | Log-Prob Drop: 0.107 | 12 |
| Question Answering | TyDi QA No-context | F1 (Arabic): 42.6 | 4 |
| Question Answering | TyDi QA Gold Passage | Arabic F1: 73.8 | 4 |
| Question Answering | TyDi QA less-mix (test) | F1: 27.4 | 3 |