Factuality Evaluation on BIO (test)
[Chart: FS Score by date. Highlighted point: CaLF, 88.9, Jun 19, 2024. Metrics: FS Score, Citation Recall.]
Updated 4d ago
Evaluation Results
Method       Details                    Date     FS Score  Citation Recall
CaLF         Base Model=MistralOrca...  2024.06  88.9      92.7
Few-shot FT  Base Model=MistralOrca...  2024.06  86.1      69.3
CaLF         Base Model=MistralOrca-7B  2024.06  83.4      92.7
Self-RAG     Parameter Count=7B         2024.06  81.2      -
Self-RAG     Parameter Count=13B        2024.06  80.2      -
Llama 2      Parameter Count=13B, M...  2024.06  79.9      -
Few-shot FT  Base Model=MistralOrca-7B  2024.06  78.7      69.3
ChatGPT      Retrieval=w/o retrieval    2024.06  71.8      -