2WikiQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Multi-hop Question Answering | 2WikiQA (test) | F1 74.3 | 71 |
| Multi-hop Question Answering | 2WikiQA OOD evaluation | EM 38.76 | 6 |
| Multi-hop Question Answering | 2WikiQA | Latency (s) 0.44 | 6 |
| Retrieval | 2WikiQA | Average Running Time (s) 0.15 | 5 |
| Question Answering | 2WikiQA | EM 53.1 | 4 |
| Question Answering | 2WikiQA (test) | EM 57.9 | 3 |