Retrieval on Bird
[Chart: Recall@10 over time. Top score: 96.1 (Baseline), Mar 7, 2026]
Updated 1mo ago
Evaluation Results
Method    Configuration              Date     Recall@10  Precision  Recall  F1 Score  PR AUC
Baseline  Model=stella               2026.03  96.1       -          -       -         -
DCTR      Model=bge-small, vote_...  2026.03  94.2       -          -       -         -
DCTR      Model=stella, vote_k=2...  2026.03  93.6       -          -       -         -
Baseline  Model=bge-small            2026.03  90.9       -          -       -         -
DCTR      Model=e5-small, vote_k...  2026.03  90.4       -          -       -         -
DCTR      Model=stella, vote_k=2...  2026.03  89.9       -          -       -         -
DCTR      Model=bge-small, vote_...  2026.03  89.3       -          -       -         -
DCTR      Model=e5-small, vote_k...  2026.03  88.2       -          -       -         -
Baseline  Model=e5-small             2026.03  85         -          -       -         -
ReAct     Backbone=Llama3.1-8B-I...  2025.01  -          15         96.7    24.5      93.5
ReAct     Backbone=GPT4o-mini, #...  2025.01  -          25.1       97      37.8      93.3
ARM       Backbone=Llama3.1-8B-I...  2025.01  -          42.7       96.5    56        92.7