Long document retrieval on ConditionalQA (test)
[Chart: F1 Score over time. Best result: AttentionRetriever, 35.26 F1 Score, Feb 12, 2026.]
Evaluation Results

Method              Details                    Links    F1 Score
AttentionRetriever  Backbone=LLaMA-3.2 3B      2026.02  35.26
GritLM              Model Type=Dense           2026.02  32.58
AttentionRetriever  Backbone=Qwen-2.5 3B       2026.02  31.06
Qwen3               Model Type=Dense           2026.02  28.92
GTR                 Model Type=Dense           2026.02  25.84
GTE-Qwen2           Model Type=Dense           2026.02  24.15
BM25                Model Type=Sparse          2026.02  19.88
ANCE                Model Type=Dense           2026.02  18.93
CDE                 Model Type=Dense           2026.02  15.47
DPR                 Model Type=Dense           2026.02  15.42
SPScanner           Model Type=Autoregressive  2026.02  13.54