
Single-hop QA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Single-hop Question Answering | Single-hop QA Average | F1 Score: 59.61 | 35 |
| Single-Hop Question Answering | Single-Hop QA NQ, TriviaQA, PopQA | NQ Score: 48.5 | 13 |
| Question Answering | Single-hop QA (NQ, PopQA, AmbigQA) (test) | F1 (NQ): 59.45 | 9 |