Standard QA Benchmarks

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
|---|---|---|---|
| Question Answering | Standard QA Benchmarks (2WikiMultiHopQA, HotpotQA, Bamboogle, MuSiQue, Natural Questions, TriviaQA, PopQA) (test) | 2WikiMultiHopQA Pass@1: 80.4 | 11 |
| Open-domain Question Answering | Standard QA Benchmarks Average | Avg@4: 61 | 9 |