Long Document Retrieval on NaturalQuestions (test)
[Chart: F1 Score over time. Highlighted point: AttentionRetriever, F1 Score 59.98, Feb 12, 2026.]
Evaluation Results
Method              Details                     Date      F1 Score
AttentionRetriever  Backbone=LLaMA-3.2 3B       2026.02   59.98
AttentionRetriever  Backbone=Qwen-2.5 3B        2026.02   56.55
GritLM              Model Type=Dense            2026.02   45.92
SPScanner           Model Type=Autoregressive   2026.02   42.37
GTE-Qwen2           Model Type=Dense            2026.02   41.31
GTR                 Model Type=Dense            2026.02   41.1
Qwen3               Model Type=Dense            2026.02   40.91
ANCE                Model Type=Dense            2026.02   38.18
BM25                Model Type=Sparse           2026.02   30.55
DPR                 Model Type=Dense            2026.02   29.81
CDE                 Model Type=Dense            2026.02   24.16