Multi-hop Question Generation on HotpotQA Full Document Context (test)
Top result: DPKGsoft, BLEU-4 23.33 (May 21, 2025). Metrics reported: BLEU-4, METEOR, ROUGE-L. Updated 4d ago.
Evaluation Results
| Method      | Input Setting | Links   | BLEU-4 | METEOR | ROUGE-L |
|-------------|---------------|---------|--------|--------|---------|
| DPKGsoft    | Full Doc...   | 2025.05 | 23.33  | 25.21  | 43.18   |
| DPKGhard    | Full Doc...   | 2025.05 | 22.74  | 24.9   | 43.29   |
| SGCM        | Full Doc...   | 2025.05 | 22.61  | 26.04  | 40.61   |
| MixQG       | Full Doc...   | 2025.05 | 22.13  | 23.78  | 41.21   |
| CQG         | Full Doc...   | 2025.05 | 21.46  | 24.97  | 39.61   |
| QA4QG-large | Full Doc...   | 2025.05 | 21.21  | 25.53  | 42.44   |
| TS-BART     | Full Doc...   | 2025.05 | 19.89  | 22.28  | 41.33   |
| BART        | Full Doc...   | 2025.05 | 16.77  | 20.07  | 33.69   |
| MulQG       | Full Doc...   | 2025.05 | 13.2   | 20.31  | 35.3    |