RAG-Completeness on ALCE (test)
[Chart: Mean Kendall's Tau by date. Jury-on-Demand: 0.47 (Dec 1, 2025)]
Updated 4d ago
Evaluation Results
| Method | Settings | Date | Mean Kendall's Tau |
|---|---|---|---|
| Jury-on-Demand | Jury Strategy=Dynamic | 2025.12 | 0.47 |
| GPT-OSS-20B | Jury Strategy=Single J... | 2025.12 | 0.4 |
| Static (Avg-All) | Jury Strategy=Static,... | 2025.12 | 0.38 |
| Static (W-Reg) | Jury Strategy=Static,... | 2025.12 | 0.34 |
| Static (Avg-TopK) | Jury Strategy=Static,... | 2025.12 | 0.28 |
| Static (W-Tau) | Jury Strategy=Static,... | 2025.12 | 0.23 |