Multi-hop Text QA on HotpotQA Full v1.1 (train)
[Chart: F1 Score; best result 82.2 (HGN, Sep 10, 2021)]
Updated 4d ago
Evaluation Results
| Method       | Backbone      | Links   | F1 Score | Top-2 Recall | Top-3 Recall | Exact Match (EM) |
|--------------|---------------|---------|----------|--------------|--------------|------------------|
| HGN          | RoBERTa-Large | 2021.09 | 82.2     | -            | -            | -                |
| ReasonBERT_R | RoBERTa-base  | 2021.09 | 78.1     | 94           | 98           | 64.8             |
| ReasonBERT_B | BERT-base     | 2021.09 | 77.2     | 93.8         | 97.8         | 63.4             |
| Splinter     | Splinter-base | 2021.09 | 76.5     | 94.1         | 97.9         | 62.5             |
| RoBERTa      | RoBERTa-base  | 2021.09 | 76.3     | 93.1         | 97.5         | 62.9             |
| SpanBERT     | SpanBERT-base | 2021.09 | 76.3     | 93.6         | 97.7         | 62.9             |
| SSPT         | SSPT-base     | 2021.09 | 75.4     | 93.9         | 97.9         | 61.5             |
| HGN          | BERT-base     | 2021.09 | 74.8     | -            | -            | -                |
| BERT         | BERT-base     | 2021.09 | 71.9     | 92.4         | 96.9         | 57.9             |
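
For readers unfamiliar with the F1 Score and Exact Match (EM) columns, the sketch below shows how these answer-level metrics are commonly computed in SQuAD-style extractive QA evaluation. It is an illustration only: this page does not state which evaluation script produced the numbers above, and the function names (`normalize_answer`, `exact_match`, `f1_score`) are hypothetical. Reported scores are typically the average of these per-example values, multiplied by 100. The Top-2 and Top-3 Recall columns are not covered here.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and English articles, collapse whitespace.

    Assumption: this mirrors the normalization commonly used in SQuAD-style
    scripts; the leaderboard's own evaluation may differ.
    """
    text = text.lower()
    punctuation = set(string.punctuation)
    text = "".join(ch for ch in text if ch not in punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, gold: str) -> float:
    """Return 1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction: str, gold: str) -> float:
    """Token-level F1 between a predicted and a gold answer string."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    # If either side is empty, score 1.0 only when both are empty.
    if not pred_tokens or not gold_tokens:
        return float(pred_tokens == gold_tokens)
    # Multiset intersection counts each shared token at most min(count) times.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Hypothetical prediction/gold pair, not taken from HotpotQA.
    print(exact_match("The Eiffel Tower", "eiffel tower!"))
    print(f1_score("Barack Obama", "Obama"))
```

EM rewards only answers that match after normalization, while F1 gives partial credit for overlapping tokens, which is why the F1 column is consistently higher than the EM column in the table.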