Multi-hop Question Answering on WebQ 2013 (test)
Best result: CIRAG, 48.3 F1 Score (Jan 11, 2026). Metrics reported: F1 Score, Exact Match.
Evaluation Results
Method      Configuration                Date     F1 Score  Exact Match
CIRAG       Backbone=Qwen2.5-7B-In...    2026.01  48.3      34
MetaRAG     Backbone=Qwen2.5-7B-In...    2026.01  48.2      33.8
Dualrag-FT  Backbone=Qwen2.5-7B-In...    2026.01  47.4      31.2
Dualrag     Backbone=Qwen2.5-7B-In...    2026.01  45.1      29.5
Kirag       Backbone=Qwen2.5-7B-In...    2026.01  44.6      27.5
FLARE       Backbone=Qwen2.5-7B-In...    2026.01  42.3      26
Native      Backbone=Qwen2.5-7B-In...    2026.01  42.1      27.8
IRCOT       Backbone=Qwen2.5-7B-In...    2026.01  41.6      25