2WikiMHQA

Benchmarks

| Task Name | Dataset Name | SOTA Result | Trend |
| --- | --- | --- | --- |
| Multi-hop Question Answering | 2WikiMHQA | F1 Score: 85.56 | 55 |
| Multi-hop Reasoning | 2WikiMHQA | AUROC: 0.7002 | 26 |
| Multi-hop Question Answering | 2WikiMHQA (test) | EM: 71.85 | 17 |
| Multi-hop Question Answering | 2WikiMHQA in-distribution | Exact Match (EM): 79.39 | 17 |
| Multi-hop Question Answering | 2WikiMHQA in-distribution v4 (test) | EM: - | 0 |