
CausalQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Question Answering | CausalQA Industrial domain (test) | F1 Score 85.84 | 15 |
| Question Answering | CausalQA Technology Setup 1 (test) | F1 85.54 | 15 |
| Question Answering | CausalQA Consumer Setup 1 (test) | F1 Score 85.84 | 15 |
| Question Answering | CausalQA Technology domain (test) | F1 Score 79 | 12 |
| Question Answering | CausalQA Consumer domain (test) | F1 80.55 | 12 |
| Question Answering | CausalQA Industrial Setup 1 (test) | F1 80.7 | 12 |
| Question Answering | CausalQA trained on NewsQA OOD (test) | F1 51.64 | 10 |
| Question Answering | CausalQA trained on SQuAD OOD (test) | F1 Score 66.86 | 10 |
| Question Answering Retrieval | CausalQA HotpotQA | Hit@1 15.4 | 6 |
| Question Answering Retrieval | CausalQA SQuAD v2.0 | Hit@1 29.3 | 6 |
| Question Answering Retrieval | CausalQA Natural Questions | Hit@1 15.7 | 6 |
| Question Answering Retrieval | CausalQA MS MARCO | Hit@1 21.3 | 6 |
| Question Answering | CausalQA (test) | F1 Score 85.53 | 2 |
| Question Answering | CausalQA (dev) | F1 Score 85.4 | 2 |